Saturday, August 17, 2024

Indicating that AI is now legal?

https://www.law.berkeley.edu/article/berkeley-law-unveils-groundbreaking-ai-law-degree-program/

Berkeley Law Unveils Groundbreaking AI-Focused Law Degree Program

Berkeley Law, renowned for its innovative legal education and leadership in law and technology, is proud to announce the launch of the first-ever law degree with a focus on artificial intelligence (AI). Set to begin in summer 2025, the AI-focused Master of Laws (LL.M.) degree is now open for applications.





Maury Nichols pointed this one out to me. He knows I need all the help I can get…

https://pro.bloomberglaw.com/insights/privacy/privacy-laws-us-vs-eu-gdpr/?utm_source=sfmc&utm_campaign=BLAW%7eNA%7eAwareness%7eRPT%7eBLAW22110285%7eA0_State_Privacy_Chart%7eInHouse%7eEM_4%7eFirstParty%7eNA%7e20240301&utm_term=Privacy_Hyperlink_Body&trackingcode=BLAW22110285&id_mc=28185495&utm_content=30065&utm_id=5eb11d52-97cf-4103-981a-6845bf2886b4&sfmc_id=28185495&sfmc_activityid=0dbc1d3a-3350-4ed8-b931-a26329440749&utm_medium=email

Comparing U.S. State Data Privacy Laws vs. the EU’s GDPR

U.S. consumer data privacy laws have much in common – both with each other and with the laws from which they took their inspiration – but subtle differences may trip up even the most seasoned compliance professionals. Here, Bloomberg Law provides an easy-to-read comparison of the EU’s General Data Protection Regulation (GDPR) against the first three data privacy laws in the U.S.: California, Virginia, and Colorado.

[Download the full chart for all the critical information at a glance.]





Tools & Techniques.

https://www.howtogeek.com/ai-tools-to-analyze-pdfs-for-free/

5 AI Tools to Analyze PDFs For Free

While various third-party AI tools offer PDF analysis capabilities, some come with a price tag, and others may not deliver accurate results. Why not just use the popular AI chatbot tools to analyze PDFs? These tools offer PDF upload features and are free to use.
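One practical wrinkle with the free chatbot route: free tiers cap how much text you can paste or upload at once, so long PDFs often have to be fed in pieces. A minimal stdlib-only sketch of that workaround (the chunk sizes are illustrative, not any chatbot's actual limits):

```python
def chunk_text(text: str, max_chars: int = 4000, overlap: int = 200) -> list[str]:
    """Split extracted PDF text into overlapping chunks that fit a paste limit.

    The overlap keeps sentences that straddle a boundary visible in both chunks.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap
    return chunks

# Demo with placeholder text standing in for an extracted PDF:
sample = "x" * 10000
parts = chunk_text(sample)
print(len(parts), [len(p) for p in parts])  # → 3 [4000, 4000, 2400]
```

Each chunk can then be pasted into the chatbot in turn, with a final prompt asking it to summarize across all the pieces.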



Friday, August 16, 2024

I suspect there is a way to take advantage of this “weakness.”

https://www.bloomberg.com/news/articles/2024-08-15/google-s-search-dominance-leaves-sites-little-choice-on-ai-scraping?accessToken=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzb3VyY2UiOiJTdWJzY3JpYmVyR2lmdGVkQXJ0aWNsZSIsImlhdCI6MTcyMzcyOTM1MywiZXhwIjoxNzI0MzM0MTUzLCJhcnRpY2xlSWQiOiJTSTlHNEREV0xVNjgwMCIsImJjb25uZWN0SWQiOiJGQTUwMzI1N0EwREI0MkNCOTNGM0YyODVENEIzRDNCMSJ9.lVM70e8MPDcqlkA2TV5xPLIagjVBkBT-KCz8kq6_ZGc

Google’s AI Search Gives Sites Dire Choice: Share Data or Die

Publishers say blocking the company’s AI bot could also prevent their sites from showing up in search

Google now displays convenient artificial intelligence-based answers at the top of its search pages — meaning users may never click through to the websites whose data is being used to power those results. But many site owners say they can’t afford to block Google’s AI from summarizing their content.

That’s because the Google tool that sifts through web content to come up with its AI answers is the same one that keeps track of web pages for search results, according to publishers. Blocking Alphabet Inc.’s Google the way sites have blocked some of its AI competitors would also hamper a site’s ability to be discovered online.

Google’s dominance in search — which a federal court ruled last week is an illegal monopoly — is giving it a decisive advantage in the brewing AI wars, which search startups and publishers say is unfair as the industry takes shape. The dilemma is particularly acute for publishers, which face a choice between offering up their content for use by AI models that could make their sites obsolete and disappearing from Google search, a top source of traffic.
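The asymmetry the publishers describe is visible in a site's robots.txt. Google documents a separate user-agent token, Google-Extended, that opts content out of Gemini model training, but the AI Overviews shown in search are fed by Googlebot itself, so (as the article describes) there is no directive that blocks the AI answers without also dropping out of search. A sketch of the dilemma, with real documented crawler tokens:

```
# Opts out of Gemini/Vertex AI training, but NOT out of AI Overviews:
User-agent: Google-Extended
Disallow: /

# Blocking Googlebot would stop AI Overviews -- and also remove the
# site from Google search entirely, which most publishers can't afford:
# User-agent: Googlebot
# Disallow: /

# Competitors' AI crawlers can be blocked with no search penalty:
User-agent: GPTBot
Disallow: /
```

This is why sites have been far more willing to block OpenAI's and other standalone AI crawlers than anything belonging to Google.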





Tools & Techniques.

https://www.howtogeek.com/duckduckgos-new-ai-chat-is-the-best-way-to-use-chatgpt/

DuckDuckGo's New AI Chat is the Best Way to use ChatGPT

DuckDuckGo's AI chatbot promises the same AI chatbot features as popular large language models, with none of the creepy privacy issues. Here's how to make the most of this feature.

Unlike Meta's AI, OpenAI's ChatGPT, or Google's Gemini, the DuckDuckGo AI chat offers users access to a multitude of large language models, including both closed-source and open-source options. Currently, users can engage with ChatGPT 3.5, ChatGPT 4.0-mini, Llama 3, Claude, and Mixtral.

Best of all, models like Llama 3 and Mixtral, which are open-source, typically require significant high-end hardware resources. However, the DuckDuckGo AI chat tool enables anyone to run these robust, open-source models directly in their browser without needing to configure anything.



Thursday, August 15, 2024

If you were going to run against someone, here’s a resource.

https://www.bespacific.com/what-elected-officials-say-and-do/

What Elected Officials Say and Do

Polarization Research Lab – America’s Political Pulse – Resources and data to understand and halt the growth of partisan animosity. What Elected Officials Say and Do – Explore data on the speech, effectiveness, and campaign support for elected U.S. legislators.

  • Using AI to assess the rhetoric of all 535 legislators in the House and Senate.

  • Pulling data from speeches, newsletters, tweets, and public statements.

  • 1,531,223 data points processed and tagged so far; updated daily.

What Voters Think about Partisan Animosity – Explore weekly survey data on partisan hatred, support for democratic norm violations, and support for partisan violence.





A clever starting point?

https://www.bespacific.com/ai-risk-repository/

AI Risk Repository

What are the risks from Artificial Intelligence? A comprehensive living database of over 700 AI risks categorized by their cause and risk domain.

What is the AI Risk Repository? The AI Risk Repository has three parts:

  • The AI Risk Database captures 700+ risks extracted from 43 existing frameworks, with quotes and page numbers.

  • The Causal Taxonomy of AI Risks classifies how, when, and why these risks occur.

  • The Domain Taxonomy of AI Risks classifies these risks into seven domains (e.g., “Misinformation”) and 23 subdomains (e.g., “False or misleading information”).

How can I use the Repository? The AI Risk Repository provides:

  • An accessible overview of the AI risk landscape.

  • A regularly updated source of information about new risks and research.

  • A common frame of reference for researchers, developers, businesses, evaluators, auditors, policymakers, and regulators.

  • A resource to help develop research, curricula, audits, and policy.

  • An easy way to find relevant risks and research.
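The two-taxonomy design above (causal: who, why, when; domain: seven top-level areas) lends itself to simple filtering once the spreadsheet is loaded. A hypothetical sketch with invented sample entries, since the real repository ships as a spreadsheet with many more columns:

```python
# Each risk carries a causal classification (entity, intent, timing)
# and a domain label. Sample entries are made up for illustration.
risks = [
    {"risk": "Deepfake impersonation", "entity": "Human", "intent": "Intentional",
     "timing": "Post-deployment", "domain": "Misinformation"},
    {"risk": "Training data leakage", "entity": "AI", "intent": "Unintentional",
     "timing": "Post-deployment", "domain": "Privacy & security"},
    {"risk": "Reward hacking", "entity": "AI", "intent": "Unintentional",
     "timing": "Pre-deployment", "domain": "AI system safety"},
]

def by_domain(entries, domain):
    """Return the risks filed under one top-level domain."""
    return [e["risk"] for e in entries if e["domain"] == domain]

print(by_domain(risks, "Misinformation"))  # → ['Deepfake impersonation']
```

The same one-liner filter works for the causal fields, e.g. pulling every unintentional pre-deployment risk for an audit checklist.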





So how much money do they owe you?

https://www.bespacific.com/has-your-paper-been-used-to-train-an-ai-model-almost-certainly/

Has your paper been used to train an AI model? Almost certainly

Nature – Artificial-intelligence developers are buying access to valuable data sets that contain research papers, raising uncomfortable questions about copyright. “Academic publishers are selling access to research papers to technology firms to train artificial-intelligence (AI) models. Some researchers have reacted with dismay at such deals happening without the consultation of authors. The trend is raising questions about the use of published and sometimes copyrighted work to train the exploding number of AI chatbots in development. Experts say that, if a research paper hasn’t yet been used to train a large language model (LLM), it probably will be soon. Researchers are exploring technical ways for authors to spot if their content is being used. Last month, it emerged that the UK academic publisher Taylor & Francis had signed a US$10-million deal with Microsoft, allowing the US technology company to access the publisher’s data to improve its AI systems. And in June, an investor update showed that US publisher Wiley had earned $23 million from allowing an unnamed company to train generative-AI models on its content. Anything that is available to read online — whether in an open-access repository or not — is “pretty likely” to have been fed into an LLM already, says Lucy Lu Wang, an AI researcher at the University of Washington in Seattle. “And if a paper has already been used as training data in a model, there’s no way to remove that paper after the model has been trained,” she adds…”





Probably inevitable.

https://www.reuters.com/world/kim-dotcom-be-extradited-new-zealand-after-12-year-fight-with-us-2024-08-15/

Kim Dotcom to be extradited from New Zealand after 12-year fight with US

Kim Dotcom, who is facing criminal charges relating to the defunct file-sharing website Megaupload, will be extradited to the United States from New Zealand, the New Zealand justice minister said on Thursday.

German-born Dotcom, who has New Zealand residency, has been fighting extradition to the United States since 2012, following an FBI-ordered raid on his Auckland mansion.





Resources. Because you never know when speaking Klingon might come in handy.

https://www.makeuseof.com/ai-powered-language-learning-apps/

5 AI-Powered Language Learning Apps Worth Trying

Are you looking to learn a new language with a minimum of effort? Well, AI-powered language-learning apps can make that dream a reality. The following apps can help transform language learning into a fun and effective process.



Wednesday, August 14, 2024

I still don’t get it. If police gather fingerprints at a crime scene, is that search overbroad because there might be prints from people other than the criminal? I would argue that the ability to search large volumes of data is a plus, not a negative. Must police know who they are looking for before they can take fingerprints?

https://techcrunch.com/2024/08/13/us-appeals-court-rules-geofence-warrants-are-unconstitutional/

US appeals court rules geofence warrants are unconstitutional

But critics have long argued that geofence warrants are unconstitutional because they can be overbroad and include information on entirely innocent people.

But because the bank of data is so big, and because the entire database has to be scanned, the court ruled that there is no legal authority capable of authorizing a search, per a blog post by law professor Orin Kerr analyzing the ruling.

The court said in its ruling (emphasis in the original): “This search is occurring while law enforcement officials have no idea who they are looking for, or whether the search will even turn up a result.”

Kerr, in his analysis, said the ruling “raises questions of whether any digital warrants for online contents are constitutional.”





Tools & Techniques. (Very Mission Impossible)

https://arstechnica.com/information-technology/2024/08/new-ai-tool-enables-real-time-face-swapping-on-webcams-raising-fraud-concerns/

Deep-Live-Cam goes viral, allowing anyone to become a digital doppelganger

Over the past few days, a software package called Deep-Live-Cam has been going viral on social media because it can take the face of a person extracted from a single photo and apply it to a live webcam video source while following pose, lighting, and expressions performed by the person on the webcam. While the results aren't perfect, the software shows how quickly the tech is developing—and how the capability to deceive others remotely is getting dramatically easier over time.



Tuesday, August 13, 2024

Isn’t “out into the world” the same as “public”? If someone saw an individual at a crime scene when the crime was committed, that would be admissible. “Time shifting” by relying on geolocation data (or videos, or even fingerprints) seems to me to be the same.

https://www.eff.org/deeplinks/2024/08/federal-appeals-court-finds-geofence-warrants-are-categorically-unconstitutional

Federal Appeals Court Finds Geofence Warrants Are “Categorically” Unconstitutional

In a major decision on Friday, the federal Fifth Circuit Court of Appeals held that geofence warrants are “categorically prohibited by the Fourth Amendment.” Closely following arguments EFF has made in a number of cases, the court found that geofence warrants constitute the sort of “general, exploratory rummaging” that the drafters of the Fourth Amendment intended to outlaw. EFF applauds this decision because it is essential that every person feels like they can simply take their cell phone out into the world without the fear that they might end up a criminal suspect because their location data was swept up in an open-ended digital dragnet.





This may be important later…

https://www.bespacific.com/the-files-are-in-the-computer-on-copyright-memorization-and-generative-ai/

The Files are in the Computer: On Copyright, Memorization, and Generative AI

Cooper, A. Feder and Grimmelmann, James, The Files are in the Computer: On Copyright, Memorization, and Generative AI (April 22, 2024). Cornell Legal Studies Research Paper, forthcoming in the Chicago-Kent Law Review. Available at SSRN: https://ssrn.com/abstract=4803118 – “The New York Times’s copyright lawsuit against OpenAI and Microsoft alleges that OpenAI’s GPT models have “memorized” Times articles. Other lawsuits make similar claims. But parties, courts, and scholars disagree on what memorization is, whether it is taking place, and what its copyright implications are. Unfortunately, these debates are clouded by deep ambiguities over the nature of “memorization,” leading participants to talk past one another. In this Essay, we attempt to bring clarity to the conversation over memorization and its relationship to copyright law. Memorization is a highly active area of research in machine learning, and we draw on that literature to provide a firm technical foundation for legal discussions. The core of the Essay is a precise definition of memorization for a legal audience. We say that a model has “memorized” a piece of training data when (1) it is possible to reconstruct from the model (2) a near-exact copy of (3) a substantial portion of (4) that specific piece of training data. We distinguish memorization from “extraction” (in which a user intentionally causes a model to generate a near-exact copy), from “regurgitation” (in which a model generates a near-exact copy, regardless of the user’s intentions), and from “reconstruction” (in which the near-exact copy can be obtained from the model by any means, not necessarily the ordinary generation process).

Several important consequences follow from these definitions. First, not all learning is memorization: much of what generative-AI models do involves generalizing from large amounts of training data, not just memorizing individual pieces of it. Second, memorization occurs when a model is trained; it is not something that happens when a model generates a regurgitated output. Regurgitation is a symptom of memorization in the model, not its cause. Third, when a model has memorized training data, the model is a “copy” of that training data in the sense used by copyright law. Fourth, a model is not like a VCR or other general-purpose copying technology; it is better at generating some types of outputs (possibly including regurgitated ones) than others. Fifth, memorization is not just a phenomenon that is caused by “adversarial” users bent on extraction; it is a capability that is latent in the model itself. Sixth, the amount of training data that a model memorizes is a consequence of choices made in the training process; different decisions about what data to train on and how to train on it can affect what the model memorizes. Seventh, system design choices also matter at generation time. Whether or not a model that has memorized training data actually regurgitates that data depends on the design of the overall system: developers can use other guardrails to prevent extraction and regurgitation. In a very real sense, memorized training data is in the model—to quote Zoolander, the files are in the computer.”
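The paper's working definition ("a near-exact copy of a substantial portion" of a specific training document) suggests a simple operational test for regurgitation: measure the longest contiguous run of text shared between a model output and the source. A rough stdlib sketch of that idea (the 50-character threshold is my illustration, not the authors'):

```python
from difflib import SequenceMatcher

def longest_shared_run(output: str, source: str) -> int:
    """Length of the longest contiguous block shared by the two texts."""
    m = SequenceMatcher(None, output, source).find_longest_match(
        0, len(output), 0, len(source))
    return m.size

def looks_regurgitated(output: str, source: str, min_chars: int = 50) -> bool:
    # "Near-exact copy of a substantial portion" -- threshold is illustrative.
    return longest_shared_run(output, source) >= min_chars

# Toy stand-ins for a training document and a model output quoting it:
article = "The quick brown fox jumps over the lazy dog. " * 3
gen = "As the model put it: " + article[:60]
print(looks_regurgitated(gen, article))  # → True
```

In the paper's terms this only detects regurgitation (the symptom); memorization itself is a property of the trained model, whether or not any given generation surfaces it.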




Monday, August 12, 2024

If public records must be private, what about the reverse?

https://thehill.com/opinion/technology/4820294-ai-data-public-records-privacy/

Public records data must be off-limits for AI

Companies are looking for innovative ways to collect data to feed their data-hungry artificial intelligence systems and create innovative applications. Fortunately, some do not have to go too far.

Then, there are the companies that collect data from public records to share on the internet and run analytics.

First, public records are neither free from biases nor representative. Outcomes from the systems trained on such data are not likely to be entirely fair.





How to abuse AI for fun and profit?

https://www.schneier.com/blog/archives/2024/08/taxonomy-of-generative-ai-misuse.html

Taxonomy of Generative AI Misuse

Interesting paper: “Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data”:

Generative, multimodal artificial intelligence (GenAI) offers transformative potential across industries, but its misuse poses significant risks. Prior research has shed light on the potential of advanced AI systems to be exploited for malicious purposes. However, we still lack a concrete understanding of how GenAI models are specifically exploited or abused in practice, including the tactics employed to inflict harm. In this paper, we present a taxonomy of GenAI misuse tactics, informed by existing academic literature and a qualitative analysis of approximately 200 observed incidents of misuse reported between January 2023 and March 2024. Through this analysis, we illuminate key and novel patterns in misuse during this time period, including potential motivations, strategies, and how attackers leverage and abuse system capabilities across modalities (e.g. image, text, audio, video) in the wild.

Blog post. Note the graphic mapping goals with strategies.





Perspective.

https://futurecio.tech/the-2024-gartner-ai-hype-cycle/

The 2024 Gartner AI Hype Cycle

Chris Howard, Gartner's chief of research, explains the Hype Cycle for Artificial Intelligence, saying that AI continues to be one of the most dominant topics Gartner covers.

"There's a trigger that happens, and people become interested in this technology; it reaches a Peak of Inflated Expectations and eventually comes off of that peak into the Trough of Disillusionment, where the work happens. Then, we might reach the Plateau of Productivity. And the AI Hype Cycle shows how fast things are moving through it, partially because of the amount of investment that's been happening over the last year and a half, two years. Now, generative AI, as you're probably feeling, is past the Peak of Inflated Expectations and starting to go into the trough," he explained.

He further explained that the Trough of Disillusionment is not dark or dangerous, but rather "the place where we figure out how to make something work or not, what it is, or not, where the hard work takes place, where the other dependencies are and so on."

American researcher and professor Donald Norman said, "Things get easier to use by becoming more complex inside."



Sunday, August 11, 2024

A win for security is a loss for privacy?

https://apnews.com/article/united-nations-cybercrime-computer-technology-39fe999d78f615912d0bdb2011290665

The UN is moving to fight cybercrime but privacy groups say human rights will be violated

A global deal on the criminal use of computer technology is moving ahead despite worries it will let governments around the world violate human rights by probing electronic communications and bypassing privacy safeguards.

Nearly 200 nations approved the United Nations Convention against Cybercrime on Thursday afternoon at a special committee meeting that capped months of complicated negotiations. The treaty — expected to win General Assembly approval within months — creates a framework for nations to cooperate against internet-related crimes, including the illegal access and interception of computer information, electronic eavesdropping, and online child sex abuse.

Many cited examples of probable downsides, like the case against Rappler, an online Philippine news outlet that angered former President Rodrigo Duterte by reporting critically on his deadly crackdown on illegal drugs and alarming human rights record. Civil libertarians said the site, founded by 2021 Nobel Peace Prize co-winner Maria Ressa, is the type that will become vulnerable around the world thanks to the new treaty, but advocates including the Biden administration said the deal reflects the interests of the U.S. and its allies.

It balances privacy concerns with the need for every country to pursue criminal activity around the world, the Biden administration said.





Perspective.

https://www.tandfonline.com/doi/abs/10.1080/08989621.2024.2386285

Is AI my co-author? The ethics of using artificial intelligence in scientific publishing

The recent emergence of Large Language Models (LLMs) and other forms of Artificial Intelligence (AI) has led people to wonder whether they could act as an author on a scientific paper. This paper argues that AI systems should not be included on the author by-line. We agree with current commentators that LLMs are incapable of taking responsibility for their work and thus do not meet current authorship guidelines. We identify other problems with responsibility and authorship. In addition, the problems go deeper as AI tools also do not write in a meaningful sense nor do they have persistent identities. From a broader publication ethics perspective, adopting AI authorship would have detrimental effects on an already overly competitive and stressed publishing ecosystem. Deterrence is possible as backward-looking tools will likely be able to identify past AI usage. Finally, we question the value of using AI to produce more research simply for publication’s sake.