The Conversation

AI companies train language models on YouTube’s archive − making family-and-friends videos a privacy risk

Published on theconversation.com – Ryan McGrady, Senior Researcher, Initiative for Digital Public Infrastructure, UMass Amherst – 2024-06-27 07:23:53

Your kid’s silly video could be fodder for ChatGPT.
Halfpoint/iStock via Getty Images

Ryan McGrady, UMass Amherst and Ethan Zuckerman, UMass Amherst

The promised artificial intelligence revolution requires data. Lots and lots of data. OpenAI and Google have begun using YouTube videos to train their text-based AI models. But what does the YouTube archive actually include?

Our team of digital media researchers at the University of Massachusetts Amherst collected and analyzed random samples of YouTube videos to learn more about that archive. We published an 85-page paper about that dataset and set up a website called TubeStats for researchers and journalists who need basic information about YouTube.

Now, we’re taking a closer look at some of our more surprising findings to better understand how these obscure videos might become part of powerful AI systems. We’ve found that many YouTube videos are meant for personal use or for small groups of people, and a significant proportion were created by children who appear to be under 13.

Bulk of the YouTube iceberg

Most people’s experience of YouTube is algorithmically curated: Up to 70% of the videos users watch are recommended by the site’s algorithms. Recommended videos are typically popular content such as influencer stunts, news clips, explainer videos, travel vlogs and video game reviews, while content that is not recommended languishes in obscurity.

Some YouTube content emulates popular creators or fits into established genres, but much of it is personal: family celebrations, selfies set to music, homework assignments, video game clips without context and kids dancing. The obscure side of YouTube – the vast majority of the estimated 14.8 billion videos created and uploaded to the platform – is poorly understood.

Illuminating this aspect of YouTube – and social media generally – is difficult because big tech companies have become increasingly hostile to researchers.

We’ve found that many videos on YouTube were never meant to be shared widely. We documented thousands of short, personal videos that have few views but high engagement – likes and comments – implying a small but highly engaged audience. These were clearly meant for a small audience of friends and family. Such social uses of YouTube contrast with videos that try to maximize their audience, suggesting another way to use YouTube: as a video-centered social network for small groups.

Other videos seem intended for a different kind of small, fixed audience: recorded classes from pandemic-era virtual instruction, school board meetings and work meetings. While not what most people think of as social uses, they likewise imply that their creators have a different expectation about the audience for the videos than creators of the kind of content people see in their recommendations.

Fuel for the AI machine

It was with this broader understanding that we read The New York Times exposé on how OpenAI and Google turned to YouTube in a race to find new troves of data to train their large language models. An archive of YouTube transcripts makes an extraordinary dataset for text-based models.

There is also speculation, fueled in part by an evasive answer from OpenAI’s chief technology officer Mira Murati, that the videos themselves could be used to train AI text-to-video models such as OpenAI’s Sora.

The New York Times story raised concerns about YouTube’s terms of service and, of course, the copyright issues that pervade much of the debate about AI. But there’s another problem: How could anyone know what an archive of more than 14 billion videos, uploaded by people all over the world, actually contains? It’s not entirely clear that Google knows or even could know if it wanted to.

Kids as content creators

We were surprised to find an unsettling number of videos featuring kids or apparently created by them. YouTube requires uploaders to be at least 13 years old, but we frequently saw children who appeared to be much younger than that, typically dancing, singing or playing video games.

In our preliminary research, our coders determined nearly a fifth of random videos with at least one person’s face visible likely included someone under 13. We didn’t take into account videos that were clearly shot with the consent of a parent or guardian.

Our current sample size of 250 is relatively small – we are working on coding a much larger sample – but the findings thus far are consistent with what we’ve seen in the past. We’re not aiming to scold Google. Age validation on the internet is infamously difficult and fraught, and we have no way of determining whether these videos were uploaded with the consent of a parent or guardian. But we want to underscore what is being ingested by these large companies’ AI models.

Small reach, big influence

It’s tempting to assume OpenAI is using highly produced influencer videos or TV newscasts posted to the platform to train its models, but previous research on large language model training data shows that the most popular content is not always the most influential in training AI models. A virtually unwatched conversation between three friends could have much more linguistic value in training a chatbot language model than a music video with millions of views.

Unfortunately, OpenAI and other AI companies are quite opaque about their training materials: They don’t specify what goes in and what doesn’t. Most of the time, researchers can infer problems with training data through biases in AI systems’ output. But when we do get a glimpse at training data, there’s often cause for concern. For example, Human Rights Watch released a report on June 10, 2024, that showed that a popular training dataset includes many photos of identifiable kids.

The history of big tech self-regulation is filled with moving goal posts. OpenAI in particular is notorious for asking for forgiveness rather than permission and has faced increasing criticism for putting profit over safety.

Concerns over the use of user-generated content for training AI models typically center on intellectual property, but there are also privacy issues. YouTube is a vast, unwieldy archive, impossible to fully review.

Models trained on a subset of professionally produced videos could conceivably be an AI company’s first training corpus. But without strong policies in place, any company that ingests more than the popular tip of the iceberg is likely including content that violates the Federal Trade Commission’s Children’s Online Privacy Protection Rule, which prevents companies from collecting data from children under 13 without notice.

With last year’s executive order on AI and at least one promising proposal on the table for comprehensive privacy legislation, there are signs that legal protections for user data in the U.S. might become more robust.

When the Wall Street Journal’s Joanna Stern asked OpenAI CTO Mira Murati whether OpenAI trained its text-to-video generator Sora on YouTube videos, she said she wasn’t sure.

Have you unwittingly helped train ChatGPT?

The intentions of a YouTube uploader simply aren’t as consistent or predictable as those of someone publishing a book, writing an article for a magazine or displaying a painting in a gallery. But even if YouTube’s algorithm ignores your upload and it never gets more than a couple of views, it may be used to train models like ChatGPT and Gemini.

As far as AI is concerned, your family reunion video may be just as important as those uploaded by influencer giant Mr. Beast or CNN.

Ryan McGrady, Senior Researcher, Initiative for Digital Public Infrastructure, UMass Amherst and Ethan Zuckerman, Associate Professor of Public Policy, Communication, and Information, UMass Amherst

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Measles can ravage the immune system and brain, causing long-term damage – a virologist explains

Published on theconversation.com – Peter Kasson, Professor of Chemistry and Biomedical Engineering, Georgia Institute of Technology – 2025-03-31 07:16:00

Measles infections send 1 in 5 people to the hospital.
wildpixel/ iStock via Getty Images Plus

Peter Kasson, Georgia Institute of Technology

The measles outbreak that began in west Texas in late January 2025 continues to grow, with 400 confirmed cases in Texas and more than 50 in New Mexico and Oklahoma as of March 28.

Public health experts believe the numbers are much higher, however, and some worry about a bigger resurgence of the disease in the U.S. In the past two weeks, health officials have identified potential measles exposures in association with planes, trains and automobiles, including at Washington Dulles International Airport and on an Amtrak train from New York City to Washington, D.C. – as well as at health care facilities where the infected people sought medical attention.

Measles infections can be extremely serious. So far in 2025, 14% of the people who got measles had to be hospitalized. Last year, that number was 40%. Measles can damage the lungs and immune system, and also inflict permanent brain damage. Three in 1,000 people who get the disease die. But because measles vaccination programs in the U.S. over the past 60 years have been highly successful, few Americans under 50 have experienced measles directly, making it easy to think of the infection as a mere childhood rash with fever.

As a biologist who studies how viruses infect and kill cells and tissues, I believe it is important for people to understand how dangerous a measles infection can be.

Underappreciated acute effects

Measles is one of the most contagious diseases on the planet. One person who has it will infect nine out of 10 people nearby if those people are unvaccinated. A two-dose regimen of the vaccine, however, is 97% effective at preventing measles.

When the measles virus infects a person, it binds to specific proteins on the surface of cells. It then inserts its genome and replicates, destroying the cells in the process. This first happens in the upper respiratory tract and the lungs, where the virus can damage the person’s ability to breathe well. In both places, the virus also infects immune cells that carry it to the lymph nodes, and from there, throughout the body.

Measles can wipe out immune cells’ ability to recognize pathogens.

What generally lands people with measles in the hospital is the disease’s effects on the lungs. As the virus destroys lung cells, patients can develop viral pneumonia, which is characterized by severe coughing and difficulty breathing. Measles pneumonia afflicts about 1 in 20 children who get measles and is the most common cause of death from measles in young children.

The virus can directly invade the nervous system and also damage it by causing inflammation. Measles can cause acute brain damage in two different ways: a direct infection of the brain that occurs in roughly 1 in 1,000 people, or inflammation of the brain two to 30 days after infection that occurs with the same frequency. Children who survive these events can have permanent brain damage and impairments such as blindness and hearing loss.

Yearslong consequences of infection

An especially alarming but still poorly understood effect of measles infection is that it can reduce the immune system’s ability to recognize pathogens it has previously encountered. Researchers had long suspected that children who get the measles vaccine also tend to have better immunity to other diseases, but they were not sure why. A study published in 2019 found that a measles infection destroyed between 11% and 75% of infected children’s antibodies, leaving them vulnerable to many of the infections to which they previously had immunity. This effect, called immune amnesia, lasts until people are reinfected or revaccinated against each disease their immune system forgot.

Occasionally, the virus can lie undetected in the brain of a person who recovered from measles and reactivate typically seven to 10 years later. This condition, called subacute sclerosing panencephalitis, is a progressive dementia that is almost always fatal. It occurs in about 1 in 25,000 people who get measles but is about five times more common in babies infected with measles before age 1.

Researchers long thought that such infections were caused by a special strain of measles, but more recent research suggests that the measles virus can acquire mutations that enable it to infect the brain during the course of the original infection.

There is still much to learn about the measles virus. For example, researchers are exploring antibody therapies to treat severe measles. However, even if such treatments work, the best way to prevent the serious effects of measles is to avoid infection by getting vaccinated.

Peter Kasson, Professor of Chemistry and Biomedical Engineering, Georgia Institute of Technology

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Supreme Court considers whether states may prevent people covered by Medicaid from choosing Planned Parenthood as their health care provider

Published on theconversation.com – Naomi Cahn, Professor of Law, University of Virginia – 2025-04-02 17:04:00

Planned Parenthood clinics, like this one in Los Angeles, are located across the United States.
Patrick T. Fallon/AFP via Getty Images

Naomi Cahn, University of Virginia and Sonia Suter, George Washington University

Having the freedom to choose your own health care provider is something many Americans take for granted. But the Supreme Court is weighing whether people who rely on Medicaid for their health insurance have that right, and if they do – is it enforceable by law?

That’s the key question at the heart of a case, Medina v. Planned Parenthood South Atlantic, that began during President Donald Trump’s first term in office.

“There’s a right, and the right is the right to choose your doctor,” said Justice Elena Kagan on April 2, 2025, during oral arguments on the case. John J. Bursch, the Alliance Defending Freedom lawyer who is representing South Carolina Director of Health and Human Services Eunice Medina, countered that none of the words in the underlying statute had what he called a “rights-creating pedigree.”

As law professors who teach courses about health and poverty law as well as reproductive justice, we think this case could affect access to health care for 72 million Americans, including low-income people and their children and people with disabilities.

Excluding Planned Parenthood

The case started with Julie Edwards, who is enrolled in Medicaid and lives in South Carolina. After she struggled to get contraceptive services, she was able to receive care from a Planned Parenthood South Atlantic clinic in Columbia, South Carolina.

Planned Parenthood, an array of nonprofits with roots that date back more than a century, is among the nation’s top providers of reproductive services. It operates two clinics in South Carolina, where Medicaid patients can get physical exams, cancer screenings, contraception and other services. It also provides same-day appointments and keeps long hours.

In July 2018, however, South Carolina Gov. Henry McMaster issued an executive order that barred health care providers in South Carolina that offer abortions from reimbursement through Medicaid.

That meant Planned Parenthood, a longtime target of conservatives’ ire, would no longer be reimbursed for any type of care for Medicaid patients, preventing Edwards from transferring all her gynecological care to that office as she had hoped to do.

Planned Parenthood and Edwards sued South Carolina, claiming that the state was violating the federal Medicare and Medicaid Act, which Congress passed in 1965, by not letting Edwards obtain care from the provider of her choice.

A ‘free-choice-of-provider’ requirement

Medicaid operates as a partnership between the federal government and the states. Congress passed the law that led to its creation based on its power under the Constitution’s spending clause, which allows Congress to subject federal funds to certain requirements.

Two years later, due to concerns that states were restricting which providers Medicaid recipients could choose, Congress added a “free-choice-of-provider” requirement to the program. It states that people enrolled in Medicaid “may obtain such assistance from any institution, agency, community pharmacy, or person, qualified to perform the service or services required.”

This provision is at the core of this case. At issue is whether a civil rights statute provides a right for Medicaid beneficiaries to sue a state when their federal rights have been violated. Known as Section 1983, it was enacted in 1871.

Bursch, backed by the Trump administration, argued before the court that the absence of words like “right” in the Medicaid provision that requires states to provide a free choice of provider means that neither Edwards nor Planned Parenthood has the authority to file a lawsuit to enforce this aspect of the Medicaid statute.

Nicole A. Saharsky, Planned Parenthood’s lawyer, argued that the creation of a right shouldn’t depend on “some kind of magic words test.” Instead, she said it was clear that the Medicaid statute created “a right to choose their own doctor” because “it’s mandatory” that the state provide this option to everyone with health insurance through Medicaid.

She also emphasized that Congress wanted to protect “an intensely personal right” to be able “to choose your doctor, the person that you see when you’re at your most vulnerable, facing … some of the most significant … challenges to your life and your health.”

Restricting Medicaid funds

Through a federal law known as the Hyde Amendment, Medicaid cannot reimburse health care providers for the cost of abortions, with a few exceptions: when a patient’s life is at risk or her pregnancy is due to rape or incest. Some states do cover abortion when their laws allow it, without using any federal funds.

Therefore, Planned Parenthood only gets federal Medicaid funds for abortions in those limited circumstances.

McMaster explained that he removed “abortion clinics,” including Planned Parenthood, from the South Carolina Medicaid Program because he didn’t want state funds to indirectly subsidize abortions.

South Carolina “decided that Planned Parenthood was unqualified for many reasons, chiefly because they’re the nation’s largest abortion provider,” Bursch told the Supreme Court.

But only 3% of Planned Parenthood’s services nationwide last year were related to abortion. Its most common service is testing for sexually transmitted diseases. Across the nation, Planned Parenthood provides health care to more than 2 million patients per year, most of whom have low incomes.

South Carolina Gov. Henry McMaster speaks to a crowd during an election night party on Nov. 3, 2020, in Columbia.
Photo by Sean Rayford/Getty Images

Section 1983

Because the Medicaid statute itself does not allow an individual to sue, Edwards and Planned Parenthood are relying on Section 1983.

Lower courts have repeatedly held that the Medicaid statute gives Edwards the right to obtain Medicaid-funded health care at her local Planned Parenthood clinic.

And the Supreme Court has long recognized that Section 1983 protects an individual’s ability to sue when their rights under a federal statute have been violated.

In 2023, for example, the court found such a right under the Medicaid Nursing Home Reform Act. The court held that Section 1983 confers the right to sue when a statute’s provisions “unambiguously confer individual federal rights.”

Consequences beyond South Carolina

The court’s decision in the Medina case on whether Medicaid patients can choose their own health care provider could have consequences far beyond South Carolina. Arkansas, Missouri and Texas have already barred Planned Parenthood from getting reimbursed by Medicaid for any kind of health care. More states could follow suit.

In addition, given Planned Parenthood’s role in providing expansive contraceptive care, disqualifying it from Medicaid could harm access to health care and increase the already-high unintended pregnancy rate in America.

The ramifications, likewise, could extend beyond the finances of Planned Parenthood.

If the court rules in South Carolina’s favor, states could also try to exclude providers based on other characteristics, such as whether their employees belong to unions or if they provide their patients with gender-affirming care, further restricting patients’ choices.

Or, as Kagan observed, states could go the opposite direction and exclude providers that don’t provide abortions and so forth. What’s really at stake, she said, is whether a patient is “entitled to see” the provider they choose regardless of what their state happens to “think about contraception or abortion or gender transition treatment.”

If the Supreme Court rules that Edwards does have a right to get health care at a Planned Parenthood clinic, the controversy would not be over. The lower courts would then have to decide whether South Carolina appropriately removed Planned Parenthood from Medicaid as an “unqualified provider.”

And if the Supreme Court rules in favor of South Carolina, then Planned Parenthood could still sue South Carolina over its decision to find the organization unqualified.

Naomi Cahn, Professor of Law, University of Virginia and Sonia Suter, Professor of Law, George Washington University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Feeling FOMO for something that’s not even fun? It’s not the event you’re missing, it’s the bonding

Published on theconversation.com – Jacqueline Rifkin, Assistant Professor of Marketing, Cornell University – 2025-04-02 07:48:00

They had so much fun without me.
Milko/E+ via Getty Images

Jacqueline Rifkin, Cornell University; Barbara Kahn, University of Pennsylvania, and Cindy Chan, University of Toronto

Imagine you’ve planned the trip of a lifetime for your animal-loving family: a cruise to Antarctica with the unique opportunity to view penguins, whales and other rare wildlife. Your adventure-loving kids can kayak through fjords, plunge into icy water and camp under the Antarctic sky.

But rather than being ecstatic, as you anticipated, your kids whine about skipping an after-school scout meeting at a neighbor’s house. Missing this ordinary weekly event triggers such intense FOMO – “fear of missing out” – for them that they don’t want to go on your amazing expedition.

If this kind of debacle sounds familiar to you – or at least if you find it perplexing – you’re not alone. The three of us are marketing professors and social psychologists who focus on how consumers make decisions and how this shapes well-being. We’ve been studying FOMO for over a decade and recently published our work in the Journal of Personality and Social Psychology. Over the years, we’ve learned what really drives intense feelings of FOMO – which explains why a run-of-the-mill meeting might feel more crucial than an over-the-top vacation.

FOMO’s real trigger

People use the term FOMO in many different ways. In our research, we focus on a very specific type of FOMO: the kind that occurs when people miss out on events that involve valued social connections.

With this kind of FOMO, we found that the pain of missing out is not related to missing the actual event or opportunity – although that could be there as well. The FOMO we study happens when people miss the chance to bond with friends, co-workers or teammates they care about.

So, the critical part of FOMO is missing out on interactions with people you value. FOMO about a group dinner at a restaurant isn’t really about the food and great lighting. Nor is FOMO about a concert just about the band’s performance. Instead, it’s about the lost opportunity to connect and make memories with people who are important to you.

Why is this upsetting? Imagine the scenario where all your best friends go out to dinner without you. They bond and make lasting memories with each other – and you’re not there for any of it.

If they get closer to each other, where does that leave you? What happens to your social relationships and your sense of belonging? Do you become a less important friend? Less worthy of future invites? Or even kicked out of the group altogether? The anxiety of FOMO can begin to spiral.

People with what psychologists call an anxious attachment style chronically fear rejection and isolation from others. Because FOMO involves anxiety about future social belonging, it may not come as a surprise that people who are naturally more anxious about their friendships tend to get more intense FOMO. When we asked people in one of our studies to scroll social media until they encountered something social they missed, we found that the more anxiously attached a participant was, the more intense FOMO they experienced.

They’ll always remember that summer cookout – and you weren’t there.
Maskot/DigitalVision via Getty Images

Not just missing Coachella

Getting FOMO for an amazing event you can’t attend makes sense. But if FOMO is less about the event itself and more about the social bonding, what happens when you miss something that’s not really fun at all?

We find that people anticipate FOMO even for unenjoyable missed events. As long as there is some form of missed social bonding, feelings of FOMO emerge. One of our studies found that people anticipated more FOMO from missing an un-fun event their friends would attend than from missing a fun event without their friends.

For better or for worse, sad and stressful events can often be emotionally bonding: Going to a funeral to support a friend, cleaning up the mess after a party, or even white-knuckling through a harrowing initiation ceremony can all offer opportunities to forge stronger connections with one another. Stressful contexts like these can be fertile grounds for FOMO.

How to fend off FOMO

Popular discussions about the negative consequences of FOMO tend to focus on the FOMO people feel from compulsively scrolling on social media and seeing what they missed out on. Consequently, much of the suggested advice on how to mitigate FOMO centers on turning off phones or taking a vacation from social media.

Those recommendations may be tough for many people to execute. Plus, they address the symptoms of FOMO, not the cause.

Our finding that the core of FOMO is anxiety about missed social relationships yields a simpler strategy to combat it: Reminding yourself of the last time you connected with close friends may provide a sense of security that staves off feelings of FOMO.

In an experiment testing multiple interventions, we asked 788 study participants to look through their social media feeds until they encountered a post of a missed social event. We asked about 200 of these participants to immediately rate how much FOMO they were feeling. They averaged a 3.2 on a 1-to-7 scale.

Another group of about 200 participants also scrolled through their social media feeds until they encountered a post of a missed social event. But before indicating how much FOMO they were feeling, we asked them to think back to a prior experience socializing and bonding with their friends. Encouragingly, this reflection exercise seemed to curtail FOMO. Their average FOMO rating was 2.7 out of 7, a significant drop.

Reminding yourself about other good times with your pals can help keep FOMO at bay.
AJ_Watt/E+ via Getty Images

With the remaining participants, we tested other strategies for mitigating FOMO – thinking about the next time they might see their friends or imagining what they’d say to a FOMO-suffering friend – but the simple reflection exercise was by far the most promising.

So, reminding yourself of the meaningful relationships you already have and reaffirming your social belonging in the moment may help combat the rush of anxiety that is characteristic of FOMO.

And missing out on social bonding experiences doesn’t have to be anxiety-provoking. In fact, in our activity-packed, hectic lives, missing some “must-attend” events may be a welcome relief – especially if you remind yourself that your social belonging is not in jeopardy. Cue a recent wave of counter-FOMO programming called JOMO, or “Joy of Missing Out.”

To quote Stuart Smalley, the fictional self-help guru of 1990s “Saturday Night Live,” reminding yourself that “I’m good enough, I’m smart enough, and doggone it, people like me!” might be just the trick to mitigate FOMO.

Jacqueline Rifkin, Assistant Professor of Marketing, Cornell University; Barbara Kahn, Patty and Jay H. Baker Professor of Marketing, University of Pennsylvania, and Cindy Chan, Assistant Professor of Marketing, University of Toronto

This article is republished from The Conversation under a Creative Commons license. Read the original article.
