williampietri's blurblog

Late Night Open Thread: On Doomerism by Anne Laurie
Wednesday January 14^th, 2026 at 12:45 PM

Balloon Juice

This is a petty issue, not an important one: a trifle, comparatively.
If you think we’re doomed, you’re free to think so. If you think the law and protest and politics are pointless, likewise. If you think methods and issues people are talking about are irrelevant, that’s for you to think.
/1

— Fucking Bitch Hat (@kenwhite.bsky.social) January 9, 2026 at 1:07 PM

/2 But if, when folks are risking their lives to oppose ICE, and people are talking about how to best defend them and how we might support each other and do right, the role YOU are called to is to slump in and tell us what we are talking about is naive and meaningless, you can just fuck right off.

/3 That sort of performative emo narcissism is a personality between you and your therapist and the Lord your God and get it fuck off my thread, I am blocking your tedious ass.

Our ancestors were real breathing people who turned sticks into fire, their descendants spent thousands of years building a regime of unsurpassed global peace and prosperity, and we shall not entertain the suggestion that it all ends here just because the man with the golden toilet says so.

— Michael Engard (@engard.me) January 9, 2026 at 1:10 PM

It's totally understandable if you're dooming about any facet of the American experiment right now. So your feelings are "valid" in the sense that they represent real anxiety, and I get that. But to vent that anxiety in other people's spaces is wrong for three reasons.

— Ken Jennings (@kenjennings.bsky.social) January 7, 2026 at 10:39 PM

First, it’s factually wrong. There will be elections in 2026 and 2028 under Trump, just like there were elections last year under Trump and during his first term. This despite one of the two major parties now harboring a lot of anti-democratic elements and ideas.

I’m not particularly interested in convincing anyone on this point and won’t try, the future is the future. But if the left side of the political spectrum is still the domain of scholarship and expertise, take note that you don’t find scholars and experts you worrying about canceled US elections.

Second, and probably most importantly, it’s tactically wrong. “No point discussing political opposition to fascism, there won’t be elections anyway” cedes victory to your enemies. It’s defeatism and nihilism.

Finally, it’s wrong AS A MATTER OF ETIQUETTE. Entering a total stranger’s discussion and leading with your private anxiety is as off-putting in social media replies as it would be in real life. If you wouldn’t interrupt a stranger at a party to announce that America is doomed, don’t do it here.

If you are anxious and sad about the state of the world, that’s fine, and there are plenty of strategies for dealing with that. But I think you already know that drive-by online dooming isn’t a strategy. It’s selfish and adolescent. It’s a contagion that only spreads the worst of you, not the best.

Take a second and think before posting the easy Eeyore reply. You might have something substantive to say instead. Or, even better, you can say nothing at all.

The post Late Night Open Thread: On Doomerism appeared first on Balloon Juice.

Read the whole story

williampietri

11 days ago

reply

ICE Tells North Side Alderperson To Back Off As She Warns Neighbors About Immigration Agents by Colin Boyle and Alex V. Hernandez
Wednesday October 22^nd, 2025 at 9:10 AM

Block Club Chicago

This is part of our series of daily recaps of ICE activity in the Chicago region. Have a tip we should check out? Email newsroom@blockclubchi.org.

ALBANY PARK — Ald. Rossana Rodriguez-Sanchez (33rd) is speaking out after federal immigration authorities led her into an alley and warned her that she was “impeding” them in Albany Park Tuesday morning, she told Block Club.

Listen 🎧

Rodriguez-Sanchez was in a car following agents driving in a black SUV near the intersection of Avers and Leland avenues around 10:35 a.m. Tuesday. The alderwoman was in her car with her Chief Of Staff Veronica Tirado-Mercado honking horns and blowing whistles to alert neighbors to the presence of federal agents, she said.

“They went into an alley because they wanted to get us secluded,” Rodriguez-Sanchez said. “And I followed them in, and that’s when they came out of the car and gave me a warning. Afterwards, we kept following them. But we lost them at some point going south on Pulaski.”

Rodriguez-Sanchez, whose ward includes parts of Albany Park, Irving Park, Avondale and Ravenswood Manor, said three federal agents wearing camouflage uniforms and masks approached her. The incident was also recorded.

One of the federal agents tells Rodriguez-Sanchez that “911 is going to be notified,” according to the video.

“I’m giving you your warning, okay?,” the agent said, according to the video.

When Rodriguez-Sanchez asks what the warning is for, the agent said “impeding in our operation, ma’am.”

The federal agent doesn’t give a name, but leans slightly forward to briefly show a patch on the left shoulder of his uniform that appears to identify him as “E2-20.”

Rodriguez-Sanchez and Tirado-Mercado continue to ask the federal agent his name, according to the video.

Instead of sharing his name, the agent reiterates that he is giving Rodriguez-Sanchez a “warning” before he and the two other agents walk back to their SUV, according to the video.

It was the first of two incidents Tuesday where elected officials were approached by federal agents while trailing them to alert neighbors.

Around 11:30 a.m. Tuesday, state Rep. Hoan Huynh said he and a staff member were following an ICE car near Montrose and Kimball avenues when agents blocked his car and one approached the car with their gun drawn. The agent pressed the gun against the car window tried to smash the window, Huynh said in a statement.

“If they can pull a gun on an elected official and try to bash in my window, there’s no end to the terror they will continue reigning on our communities,” Huynh said in a statement sent by his Congressional campaign. “We must fight back against this fascist regime that has no place in America.”

A Homeland Security spokesperson did not immediately answer questions about the interactions.

A Chevy Tahoe with a California license plate is driven by federal agents on Whipple Street in Albany Park after detaining a person on Oct. 21, 2025. Credit: Provided

Earlier, when Rodriguez-Sanchez began following the federal agents in the SUV, she saw them reach for their masks before they pulled into the alley, she said.

“It was kind of nerve-racking. Because these are heavily armed men, who are in masks and in military uniform. I am there alone with my chief of staff. We were not armed. We don’t have any way to defend ourselves from them,” Rodriguez-Sanchez said. “But I think that this is a moment where we’re just gonna have to do what we have to do to keep people safe.”

Tuesday afternoon, the alderwoman was also notified of another incident of two Polish contractors being detained by federal agents near the intersection of Whipple Street and Montrose Avenue, she said.

“Fear cannot paralyze us, right? Yes, we’re going to be scared. There are going to be moments when we’re scared. But we cannot allow that to freeze us, or lead us to despair. Or to lead us to inaction. This is the moment when we need all hands on deck,” Rodriguez-Sanchez said.

Ald. Rossana Rodriguez Sanchez (33rd) takes a call while watching for federal agents outside of Hibbard Elementary School in Albany Park on Oct. 21, 2025.

Agents Active On North Side Tuesday

Rodriguez-Sanchez was slated to be at City Hall Tuesday, but as soon as she heard reports of federal immigration agents in Albany Park, Edgewater and Rogers Park she decided to head to her ward to try to warn neighbors, she said. Earlier this month, federal immigration agents deployed tear gas in Albany Park to chase off neighbors who successfully stopped them from detaining a neighbor.

“They are definitely hitting the North Side today, and particularly neighborhoods that are heavily immigrant populated,” Rodriguez-Sanchez said.

Listen 🎧

Block Club reached out to several Albany Park organizers who were aware of agents in the area, and were out monitoring school drop-off and dismissals with whistles, know your rights information and cell phones in hand to help protect students, parents and other community members from federal agents.

Rodriguez-Sanchez urged local businesses to post visible signs letting federal immigration agents know their spaces are private property and that they are not welcome there, she said.

“We have those available at our office, 4747 N. Sawyer Ave. We also did a push today and try to get those out so business can have them,” Rodriguez-Sanchez said. “We want everybody to stay vigilant and to do whatever they can to protect their neighbors.”

Members of the Texas National Guard walk around at the Joliet Local Training Area in Elwood on Tuesday, Oct. 7, 2025. Credit: Talia Sprague/Block Club Chicago

National Guard Block Extended As Trump Lawyers Eye Supreme Court

Trump administration attorneys agreed Tuesday to a 30-day extension on a temporary restraining order that has so far blocked the National Guard from being deployed to Chicago.

The extension through Nov. 24 takes some wind out the sails of a Wednesday hearing where U.S. District Judge April Perry was expected to rule on whether the two-week restraining order would continue as the lawsuit brought by the state of Illinois against the Trump administration continues to play out.

Both parties spoke this week and mutually agreed to the extension on the condition the agreement would not influence appeals to higher courts, according to court records.

Last week, the Trump administration appealed the case to the U.S. Supreme Court, setting the stage for a potential watershed ruling on the reaches of presidential power.

State attorneys filed a response to the Supreme Court Monday, arguing there is “no rebellion or danger of rebellion” giving Trump grounds to federalize the National Guard to quell protesters of his immigration “blitz” in Chicago.

Mother Of ‘Face of Operation Midway Blitz’ Speaks Out

Last month, the Department of Homeland Security announced it would ramp up immigration enforcement in Chicago under “Operation Midway Blitz.” The operation would be “in honor” of Katie Abraham, a 20-year-old from suburban Glenview who was killed in a January hit-and-run crash by a Guatemalan man also charged with faking personal documents, officials announced.

Abraham’s father and step-mom appeared in an August video produced by the Department of Homeland Security, blaming Democrats for sanctuary policies and people illegally entering the country.

But in an op-ed published by the Tribune on Tuesday, Abraham’s mother, Denise Lorence, wrote her daughter “would not have wanted” the federal operation as it has played out.

“Katie would not want to be associated with an operation in which kids witness their parents being taken into custody on their way to or from school. She wouldn’t support scaring kids with the use of military efforts in their neighborhoods or in their apartment buildings,” Lorence wrote.

She added her daughter was not outwardly political, “did not choose to be thrust into this political spotlight” or be used as a “political pawn.”

“A complex factor is that Katie’s father and his wife agreed to use Katie’s name in support of Operation Midway Blitz,” Lorence wrote. “I want to acknowledge the depths of her dad’s grief. I will never fault or question someone in the way they grieve.”

Scenes from a standoff at 105th and Avenue N on Chicago’s southeast side, between ICE and other federal agents and furious residents after a vehicle chase resulting in two cars blocking the intersect and residents gathered Credit: Matthew Kaplan/Block Club Chicago

Fundraiser For Woman Wrongfully Detained By Feds Raises Over $10,000

A Chicago family is raising funds for legal and medical needs after ICE agents hit Dayanne Figueroa’s car in the River West area earlier this month and aggressively pulled her from her car, detaining her without explanation, the family wrote on a GoFundMe.

The Oct. 10 incident, caught on video by several witnesses, has left Figueroa — a U.S. citizen — without a car and suffering from mental and physical trauma, her family said.

She was recovering from kidney surgery and had to make another hospital visit because of the excessive force used during her detention, her family said.

Figueroa is “unable to work for the time being, and is burdened with growing medical expenses,” her family wrote. “The good news is that she is home now and able to begin recovering with the support of family.”

The GoFund has raised over $10,000 of a $50,000 goal by Tuesday afternoon.

Happening In Chicago

ICE agents detained at least one person near Balmoral Avenue and Clark Street and at least one other near Kenmore and Glenlake avenues in Edgewater Tuesday morning, Ald. Leni Manaa-Hoppenworth (48th) said in a statement. “ICE agents have been active throughout the ward today,” she said.
Around 11 a.m. Tuesday, ICE detained at least one person near Lincoln and Warner avenues, Ald. Matt Martin (47th) said in a statement. “There have been multiple other sightings of ICE agents driving Jeep Wagoneers and other SUVs in the area this morning,” Martin said. “The vast majority of these sightings have occurred north of Irving Park [Road].”
There were multiple ICE sightings around Albany Park, with reports of federal agents attempting to arrest landscapers near Montrose and California avenues around 9:15 a.m. Tuesday, according to Northwest Side Rapid Response team.
Federal agents in six cars were seen near Laurence Ave and Sheridan Road in Uptown around 11 a.m. Tuesday, according to local nonprofit Asian Americans Advancing Justice.
“The Northside of Chicago is getting hit hard today, Albany Park and Uptown,” advocates with Organized Communities Against Deportations wrote on social media Tuesday. “Stay alert, use whistles if you have any. Alert your neighbors to stay inside if possible!”

-Mack Liederman and Ariel Parrella-Aureli contributed.

Support Local News!

Subscribe to Block Club Chicago, an independent, 501(c)(3), journalist-run newsroom. Every dime we make funds reporting from Chicago’s neighborhoods. Already subscribe? Click here to gift a subscription, or you can support Block Club with a tax-deductible donation.

Listen to the Block Club Chicago podcast:

Read the whole story

williampietri

95 days ago

reply

AI Isn't Always Intelligent, Like These 24 AI Fails
Thursday June 26^th, 2025 at 7:50 PM

Know Your Meme Editorials

Artificial intelligence and the use of AI for many things is a conversation occurring all over the world. People either love or hate AI, but everyone can agree that there are some things AI is not ready for. To some degree, people would expect AI to be capable of answering simple, straightforward questions quickly. However, the program is learning, so it is very common for AI programs to get extremely straightforward questions and inquiries completely wrong, which is hilarious.

From simple math to well-known answers to questions about TV shows, AI has been caught making some pretty big oversights in many regions and subjects. Who knows if AI will ever be totally reliable, but for now, people need to use caution. Everyone can find the humor in the mistakes AI makes, especially because of how smart some folks claim it to be.

Here, we have collected 24 times someone asked an AI program a question only to have it pop out something so shocking, incorrect, or just plain strange in return. Enjoy these "smart" computers making dumb mistakes.

Killing it

Is Google evil?

This seems… wrong

Oh… oh no…

No one needed this

Understood

Totally safe

Thanks, Google

Very experimental

I don't know what to say

This is so funny

Wait a second…

It sure is

Cool, AI

Good to know

Ohhh…

Hmmm…

AI is the future

Screenshot in 2025

Where is mama?

Ya don't say…

Wonderful

You don't say

Tasty

Read the whole story

williampietri

213 days ago

reply

Inside the AI Prompts DOGE Used to “Munch” Contracts Related to Veterans’ Health by by Brandon Roberts and Vernal Coleman
Sunday June 8^th, 2025 at 8:50 AM

ProPublica

by Brandon Roberts and Vernal Coleman

ProPublica is a nonprofit newsroom that investigates abuses of power. Sign up to receive our biggest stories as soon as they’re published.

When an AI script written by a Department of Government Efficiency employee came across a contract for internet service, it flagged it as cancelable. Not because it was waste, fraud or abuse — the Department of Veterans Affairs needs internet connectivity after all — but because the model was given unclear and conflicting instructions.

Sahil Lavingia, who wrote the code, told it to cancel, or in his words “munch,” anything that wasn’t “directly supporting patient care.” Unfortunately, neither Lavingia nor the model had the knowledge required to make such determinations.

Sahil Lavingia at his office in Brooklyn (Ben Sklar for ProPublica)

“I think that mistakes were made,” said Lavingia, who worked at DOGE for nearly two months, in an interview with ProPublica. “I’m sure mistakes were made. Mistakes are always made.”

It turns out, a lot of mistakes were made as DOGE and the VA rushed to implement President Donald Trump’s February executive order mandating all of the VA’s contracts be reviewed within 30 days.

ProPublica obtained the code and prompts — the instructions given to the AI model — used to review the contracts and interviewed Lavingia and experts in both AI and government procurement. We are publishing an analysis of those prompts to help the public understand how this technology is being deployed in the federal government.

The experts found numerous and troubling flaws: the code relied on older, general-purpose models not suited for the task; the model hallucinated contract amounts, deciding around 1,100 of the agreements were each worth $34 million when they were sometimes worth thousands; and the AI did not analyze the entire text of contracts. Most experts said that, in addition to the technical issues, using off-the-shelf AI models for the task — with little context on how the VA works — should have been a nonstarter.

Lavingia, a software engineer enlisted by DOGE, acknowledged there were flaws in what he created and blamed, in part, a lack of time and proper tools. He also stressed that he knew his list of what he called “MUNCHABLE” contracts would be vetted by others before a final decision was made.

Portions of the prompt are pasted below along with commentary from experts we interviewed. Lavingia published a complete version of it on his personal GitHub account.

Problems with how the model was constructed can be detected from the very opening lines of code, where the DOGE employee instructs the model how to behave:

You are an AI assistant that analyzes government contracts. Always provide comprehensive few-sentence descriptions that explain WHO the contract is with, WHAT specific services/products are provided, and WHO benefits from these services. Remember that contracts for EMR systems and healthcare IT infrastructure directly supporting patient care should be classified as NOT munchable. Contracts related to diversity, equity, and inclusion (DEI) initiatives or services that could be easily handled by in-house W2 employees should be classified as MUNCHABLE. Consider 'soft services' like healthcare technology management, data management, administrative consulting, portfolio management, case management, and product catalog management as MUNCHABLE. For contract modifications, mark the munchable status as 'N/A'. For IDIQ contracts, be more aggressive about termination unless they are for core medical services or benefits processing.

This part of the prompt, known as a system prompt, is intended to shape the overall behavior of the large language model, or LLM, the technology behind AI bots like ChatGPT. In this case, it was used before both steps of the process: first, before Lavingia used it to obtain information like contract amounts; then, before determining if a contract should be canceled.

Including information not related to the task at hand can confuse AI. At this point, it’s only being asked to gather information from the text of the contract. Everything related to “munchable status,” “soft-services” or “DEI” is irrelevant. Experts told ProPublica that trying to fix issues by adding more instructions can actually have the opposite effect — especially if they’re irrelevant.

Analyze the following contract text and extract the basic information below. If you can't find specific information, write "Not found".

CONTRACT TEXT: {text[:10000]} # Using first 10000 chars to stay within token limits

The models were only shown the first 10,000 characters from each document, or approximately 2,500 words. Experts were confused by this, noting that OpenAI models support inputs over 50 times that size. Lavingia said that he had to use an older AI model that the VA had already signed a contract for.

Please extract the following information: 1. Contract Number/PIID 2. Parent Contract Number (if this is a child contract) 3. Contract Description - IMPORTANT: Provide a DETAILED 1-2 sentence description that clearly explains what the contract is for. Include WHO the vendor is, WHAT specific products or services they provide, and WHO the end recipients or beneficiaries are. For example, instead of "Custom powered wheelchair", write "Contract with XYZ Medical Equipment Provider to supply custom-powered wheelchairs and related maintenance services to veteran patients at VA medical centers." 4. Vendor Name 5. Total Contract Value (in USD) 6. FY 25 Value (in USD) 7. Remaining Obligations (in USD) 8. Contracting Officer Name 9. Is this an IDIQ contract? (true/false) 10. Is this a modification? (true/false)

This portion of the prompt instructs the AI to extract the contract number and other key details of a contract, such as the “total contract value.”

This was error-prone and not necessary, as accurate contract information can already be found in publicly available databases like USASpending. In some cases, this led to the AI system being given an outdated version of a contract, which led to it reporting a misleadingly large contract amount. In other cases, the model mistakenly pulled an irrelevant number from the page instead of the contract value.

“They are looking for information where it’s easy to get, rather than where it’s correct,” said Waldo Jaquith, a former Obama appointee who oversaw IT contracting at the Treasury Department. “This is the lazy approach to gathering the information that they want. It’s faster, but it’s less accurate.”

Lavingia acknowledged that this approach led to errors but said that those errors were later corrected by VA staff.

Once the program extracted this information, it ran a second pass to determine if the contract was “munchable.”

Based on the following contract information, determine if this contract is "munchable" based on these criteria:

CONTRACT INFORMATION: {text[:10000]} # Using first 10000 chars to stay within token limits

Again, only the first 10,000 characters were shown to the model. As a result, the munchable determination was based purely on the first few pages of the contract document.

Then, evaluate if this contract is "munchable" based on these criteria: - If this is a contract modification, mark it as "N/A" for munchable status - If this is an IDIQ contract: * For medical devices/equipment: NOT MUNCHABLE * For recruiting/staffing: MUNCHABLE * For other services: Consider termination if not core medical/benefits - Level 0: Direct patient care (e.g., bedside nurse) - NOT MUNCHABLE - Level 1: Necessary consultants that can't be insourced - NOT MUNCHABLE

The above prompt section is the first set of instructions telling the AI how to flag contracts. The prompt provides little explanation of what it’s looking for, failing to define what qualifies as “core medical/benefits” and lacking information about what a “necessary consultant” is.

For the types of models the DOGE analysis used, including all the necessary information to make an accurate determination is critical.

Cary Coglianese, a University of Pennsylvania professor who studies the governmental use of artificial intelligence, said that knowing which jobs could be done in-house “calls for a very sophisticated understanding of medical care, of institutional management, of availability of human resources” that the model does not have.

- Contracts related to "diversity, equity, and inclusion" (DEI) initiatives - MUNCHABLE

The prompt above tries to implement a fundamental policy of the Trump administration: killing all DEI programs. But the prompt fails to include a definition of what DEI is, leaving the model to decide.

Despite the instruction to cancel DEI-related contracts, very few were flagged for this reason. Procurement experts noted that it’s very unlikely for information like this to be found in the first few pages of a contract.

- Level 2+: Multiple layers removed from veterans care - MUNCHABLE - Services that could easily be replaced by in-house W2 employees - MUNCHABLE

These two lines — which experts say were poorly defined — carried the most weight in the DOGE analysis. The response from the AI frequently cited these reasons as the justification for munchability. Nearly every justification included a form of the phrase “direct patient care,” and in a third of cases the model flagged contracts because it stated the services could be handled in-house.

The poorly defined requirements led to several contracts for VA office internet services being flagged for cancellation. In one justification, the model had this to say:

The contract provides data services for internet connectivity, which is an IT infrastructure service that is multiple layers removed from direct clinical patient care and could likely be performed in-house, making it classified as munchable.

IMPORTANT EXCEPTIONS - These are NOT MUNCHABLE: - Third-party financial audits and compliance reviews - Medical equipment audits and certifications (e.g., MRI, CT scan, nuclear medicine equipment) - Nuclear physics and radiation safety audits for medical equipment - Medical device safety and compliance audits - Healthcare facility accreditation reviews - Clinical trial audits and monitoring - Medical billing and coding compliance audits - Healthcare fraud and abuse investigations - Medical records privacy and security audits - Healthcare quality assurance reviews - Community Living Center (CLC) surveys and inspections - State Veterans Home surveys and inspections - Long-term care facility quality surveys - Nursing home resident safety and care quality reviews - Assisted living facility compliance surveys - Veteran housing quality and safety inspections - Residential care facility accreditation reviews

Despite these instructions, AI flagged many audit- and compliance-related contracts as “munchable,” labeling them as “soft services.”

In one case, the model even acknowledged the importance of compliance while flagging a contract for cancellation, stating: “Although essential to ensuring accurate medical records and billing, these services are an administrative support function (a ‘soft service’) rather than direct patient care.”

Key considerations: - Direct patient care involves: physical examinations, medical procedures, medication administration - Distinguish between medical/clinical and psychosocial support

Shobita Parthasarathy, professor of public policy and director of the Science, Technology, and Public Policy Program at University of Michigan, told ProPublica that this piece of the prompt was notable in that it instructs the model to “distinguish” between the two types of services without instructing the model what to save and what to kill.

The emphasis on “direct patient care” is reflected in how often the AI cited it in its recommendations, even when the model did not have any information about a contract. In one instance where it labeled every field “not found,” it still decided the contract was munchable. It gave this reason:

Without evidence that it involves essential medical procedures or direct clinical support, and assuming the contract is for administrative or related support services, it meets the criteria for being classified as munchable.

In reality, this contract was for the preventative maintenance of important safety devices known as ceiling lifts at VA medical centers, including three sites in Maryland. The contract itself stated:

Ceiling Lifts are used by employees to reposition patients during their care. They are critical safety devices for employees and patients, and must be maintained and inspected appropriately.

Specific services that should be classified as MUNCHABLE (these are "soft services" or consulting-type services): - Healthcare technology management (HTM) services - Data Commons Software as a Service (SaaS) - Administrative management and consulting services - Data management and analytics services - Product catalog or listing management - Planning and transition support services - Portfolio management services - Operational management review - Technology guides and alerts services - Case management administrative services - Case abstracts, casefinding, follow-up services - Enterprise-level portfolio management - Support for specific initiatives (like PACT Act) - Administrative updates to product information - Research data management platforms or repositories - Drug/pharmaceutical lifecycle management and pricing analysis - Backup Contracting Officer's Representatives (CORs) or administrative oversight roles - Modernization and renovation extensions not directly tied to patient care - DEI (Diversity, Equity, Inclusion) initiatives - Climate & Sustainability programs - Consulting & Research Services - Non-Performing/Non-Essential Contracts - Recruitment Services

This portion of the prompt attempts to define “soft services.” It uses many highly specific examples but also throws in vague categories without definitions like “non-performing/non-essential contracts.”

Experts said that in order for a model to properly determine this, it would need to be given information about the essential activities and what’s required to support them.

Important clarifications based on past analysis errors: 2. Lifecycle management of drugs/pharmaceuticals IS MUNCHABLE (different from direct supply) 3. Backup administrative roles (like alternate CORs) ARE MUNCHABLE as they create duplicative work 4. Contract extensions for renovations/modernization ARE MUNCHABLE unless directly tied to patient care

This section of the prompt was the result of analysis by Lavingia and other DOGE staff, Lavingia explained. “This is probably from a session where I ran a prior version of the script that most likely a DOGE person was like, ‘It’s not being aggressive enough.’ I don’t know why it starts with a 2. I guess I disagreed with one of them, and so we only put 2, 3 and 4 here.”

Notably, our review found that the only clarifications related to past errors were related to scenarios where the model wasn’t flagging enough contracts for cancellation.

Direct patient care that is NOT MUNCHABLE includes: - Conducting physical examinations - Administering medications and treatments - Performing medical procedures and interventions - Monitoring and assessing patient responses - Supply of actual medical products (pharmaceuticals, medical equipment) - Maintenance of critical medical equipment - Custom medical devices (wheelchairs, prosthetics) - Essential therapeutic services with proven efficacy

For maintenance contracts, consider whether pricing appears reasonable. If maintenance costs seem excessive, flag them as potentially over-priced despite being necessary.

This section of the prompt provides the most detail about what constitutes “direct patient care.” While it does cover many aspects of care, it still leaves a lot of ambiguity and forces the model to make its own judgements about what constitutes “proven efficacy” and “critical” medical equipment.

In addition to the limited information given on what constitutes direct patient care, there is no information about how to determine if a price is “reasonable,” especially since the LLM only sees the first few pages of the document. The models lack knowledge about what’s normal for government contracts.

“I just do not understand how it would be possible. This is hard for a human to figure out,” Jaquith said about whether AI could accurately determine if a contract was reasonably priced. “I don’t see any way that an LLM could know this without a lot of really specialized training.”

Services that can be easily insourced (MUNCHABLE): - Video production and multimedia services - Customer support/call centers - PowerPoint/presentation creation - Recruiting and outreach services - Public affairs and communications - Administrative support - Basic IT support (non-specialized) - Content creation and writing - Training services (non-specialized) - Event planning and coordination

This section explicitly lists which tasks could be “easily insourced” by VA staff, and more than 500 different contracts were flagged as “munchable” for this reason.

“A larger issue with all of this is there seems to be an assumption here that contracts are almost inherently wasteful,” Coglianese said when shown this section of the prompt. “Other services, like the kinds that are here, are cheaper to contract for. In fact, these are exactly the sorts of things that we would not want to treat as ‘munchable.’” He went on to explain that insourcing some of these tasks could also “siphon human sources away from direct primary patient care.”

In an interview, Lavingia acknowledged some of these jobs might be better handled externally. “We don’t want to cut the ones that would make the VA less efficient or cause us to hire a bunch of people in-house,” Lavingia explained. “Which currently they can’t do because there’s a hiring freeze.”

The VA is standing behind its use of AI to examine contracts, calling it “a commonsense precedent.” And documents obtained by ProPublica suggest the VA is looking at additional ways AI can be deployed. A March email from a top VA official to DOGE stated:

Today, VA receives over 2 million disability claims per year, and the average time for a decision is 130 days. We believe that key technical improvements (including AI and other automation), combined with Veteran-first process/culture changes pushed from our Secretary’s office could dramatically improve this. A small existing pilot in this space has resulted in 3% of recent claims being processed in less than 30 days. Our mission is to figure out how to grow from 3% to 30% and then upwards such that only the most complex claims take more than a few days.

If you have any information about the misuse or abuse of AI within government agencies, reach out to us via our Signal or SecureDrop channels.

If you’d like to talk to someone specific, Brandon Roberts is an investigative journalist on the news applications team and has a wealth of experience using and dissecting artificial intelligence. He can be reached on Signal @brandonrobertz.01 or by email brandon.roberts@propublica.org.

Read the whole story

williampietri

231 days ago

reply

isn’t it crazy that a woman being gender nonconforming literally just requires her to exist in her…
Friday May 30^th, 2025 at 9:06 PM

Latining

hot-on-my-watch:

hot-on-my-watch:
tannisroute:
tannisroute:
isn’t it crazy that a woman being gender nonconforming literally just requires her to exist in her own body without making any changes whatsoever. why does the fact that i don’t wear makeup and i don’t shave and i don’t wear a bra have to be some political act. why can’t i just fucking exist
it is Exactly this kind of thinking that inspired this post lol
Alright Judith Butler, bit early in the day to be proving so conclusively that gender is at least in part a social construct isn’t it? 😅😅😅
And the autistic in me wants to make sure you know what “conformity” is.
But yes, strong agree.
Recently I realised that while me and various other women I know would be quietly delighted to be lazered up and so never again grow hair on our armpits, genitals or legs, my husband, who has never had a beard or moustache and does not intend to, would not make the same choice for his face. He was actually very surprised at me. Then I saw a post on Reddit where some man called his girlfriend a whore for having had the same body hair lazered off. Different worlds!
And I say this as someone who has had all their natural body hair for years now- disability baby!
To clarify and apologise because I misspoke, @tannisroute makes an extremely good point about the nature of gender conformity for pubescent girls and women, certainly in the West. Men express physical gender conformity by leaving their body much as it is, whereas women can only do it by actively altering ours in never-ending processes that consume much more time, energy and expense. As OP said:

Man trims only hair on head: conformity
Women trims only hair on head: non-conformity.

Man wears no makeup: conformity
Woman wears no makeup: non-conformity

For men, gender conformity is more often a lack of action, where for us it is action itself.

Read the whole story

williampietri

240 days ago

reply

Chasing the Electric Pangolin Open Thread
Sunday May 18^th, 2025 at 6:43 PM

The BS Detector

A few months ago, I remember reading some press about a new economics preprint out of MIT. The Wall Street Journal covered the research a few days after it dropped online, with the favorable headline, “Will AI Help or Hurt Workers? One 26-Year-Old Found an Unexpected Answer.” The photo for the article shows the promising young author, Aidan Toner-Rodgers, standing next to two titans of economics research, Daron Acemoglu (2024 Nobel laureate in economics) and David Autor.

“It’s fantastic,” said Acemoglu.
“I was floored,” said Autor.

The Atlantic and Nature covered the research as well, with both publications seemingly stunned by the quality of the work. And indeed, the quality of the work was stunningly high! The article analyzes data from a randomized trial of over one thousand materials researchers at the R&D lab of a US-based firm who were given access to AI tools. Toner-Rodgers adeptly tracks the effect of access to these AI tools on:

The number of materials discovered by the researchers.
The number of patents filed on those new materials.
The number of new product prototypes developed based on those new materials.
The time-allocation of the researchers over time, split between experimentation, judgment, and ideation.
The sentiment towards AI of the researchers, before and after AI tool adoption.

Not only do each of these metrics show really clear effects, but Toner-Rodgers throws every tool in the book at exploring them, using a number of really sophisticated methodologies that must have taken tremendous effort and care:

He calculates the quality of the new materials through a really elaborate algorithm that measures the distance from the “target” properties for each material discovered.
He measures the structural similarity of the crystal structures of the new materials to current materials by calculating the difference in atomic positions. This is really hard to do, even for materials scientists, let alone for economists!
He determines the novelty of patents using bigram analysis.
He uses a large language model (Claude 3.5) for the automated classification of research tasks.

At the time I saw the press coverage, I didn’t bother to click on the actual preprint and read the work. The results seemed unsurprising: when researchers were given access to AI tools, they became more productive. That sounds reasonable and expected.

Toner-Rodgers submitted his paper to The Quarterly Journal of Economics, the top econ journal in the world. His website said that he had received a “revise and resubmit” already, meaning that the article was probably well on its way to being published.

Unfortunately for everyone involved, the work is entirely fraudulent. MIT put out a press release this morning stating that they had conducted an internal, confidential review and that they have “no confidence in the veracity of the research contained in the paper.” The WSJ has covered this development as well. The econ department at MIT sent out an internal email so direly-worded on the matter that on first glance, students reading the email had assumed someone had died.

In retrospect, there had been omens and portents. I wish I had read the article at the time of publication, because I suspect my BS detector would have risen to an 11 out of 10 if I’d given it a close read. It really is the perfect subject for this blog: a fraudulent preprint called “Artificial Intelligence, Scientific Discovery, and Product Innovation,” with a focus on materials science research.

Hindsight is of course 20/20, but the first red flag that should have been raised is the source of the data itself. The article gives enough details to raise some intense curiosity. It’s a US-based firm that has (at least) 1,018 researchers devoted to materials discovery alone, an enormous amount. This narrows it down to a handful of firms. Initially the companies Apple, Intel, and 3M came to mind, but then I noticed this breakdown of the materials specialization of the researchers in the study:

This was bizarre to me, as very few companies do massive amounts of materials research and which also is split fairly evenly across the spectrum of materials, in disparate domains such as biomaterials and metal alloys. I did some “deep research” to confirm this hypothesis (thank you ChatGPT and Gemini) and I believe that there are a few companies that could plausibly meet this description: 3M, Dupont, Dow, and Corning. None of these are perfect fits, either, especially with the 32% share on metals and alloys.

I’ll really be embarrassing myself if it turns out that an actual R&D lab was supplying Toner-Rodgers with data and he was just fraudulently manipulating it, but I think this is quite unlikely, and it’s more plausible that the data was entirely fabricated to begin with. I have several reasons for believing this:

Why would a large company like this take such pains to run a randomized trial on its own employees, tracking a number of metrics of their performance, only to anonymously give this data to a single researcher from MIT—a first year PhD student, mind you—rather than publishing the findings themselves?
Even at those large R&D companies, only a small fraction of researchers are devoted to the task of “materials discovery,” and it seems implausible that a company would run an experiment on AI adoption on over a thousand employees in such a structured manner.
The description of the tasks that these employees do and the divisions between fields, and all the other information provided seems almost too neat to be true. Real companies don’t have hundreds of R&D teams each working on similar tasks, all of a similar size, all tracking the same metrics. It reads like how an economics student at MIT imagines R&D labs to be run if their only experience with such labs are from reading the top 1% of economics papers on innovation in research.

The next red flag should have been how spotless the findings were. In every domain that was explored, there was a fairly unambiguous result. New materials? Up by 44% (p<0.000). New patents? Up by 39% (p<0.000). New prototypes? Up by 17% (p<0.001).

The quality of the new materials? Up, and statistically significant. The novelty of the new materials? Up, and statistically significant. Did researchers who were previously more talented improve more from AI tool use? Yes. Were these results reflected in researchers self-assessments of their time allocation? Unambiguously yes. The plot for that last bit is every economist’s dream, a perfect encapsulation of the principle of comparative advantage taking effect:

And look how contrived and neat this other plot looks, showing whether researchers’ self-assessment of their judgment ability correlates with their survey response on the role of different domains of knowledge in AI materials discovery. Three out of four categories show a neat increase and one out of four remains constant (which is the one that from first principles seems like it wouldn’t matter, experience using other AI-evaluation tools).

This plot also makes no sense, when you think about it. Why would researchers with better judgment be systematically more likely to give higher numbers on this survey question on average?

Q3: On a scale of 1–10, how useful are each of the following in evaluating AI-suggested candidate materials (scientific training, experience with similar materials, intuition or gut feeling, and experience with similar tools)?

And then, to cap it off, here’s how Toner-Rodgers describes a fortuitous round of layoffs at the firm, that miraculously doesn’t interfere with the data collection for the primary analysis and yet contributes an insightful example that supports his findings:

“In the final month of my sample—excluded from the primary analysis—the firm restructured its research teams. The lab fired 3% of its researchers. At the same time, it more than offset these departures through increased hiring, expanding its workforce on net. While I do not observe the abilities of the new hires, those dismissed were significantly more likely to have weak judgment. Figure 13 shows the percent fired or reassigned by quartile of γˆ j. Scientists in the top three quartiles faced less than a 2% chance of being let go, while those in the bottom quartile had nearly a 10% chance.”

I mean, come on, be for real…

Share

Now, my background in materials science provides me a neat leg up, as I’d assume the vast majority of those reviewing/reading/following this paper are economists and people interested in the effects of AI use.

How do the parts of this paper that directly engage with materials science hold up? Well, they’re a little too clever. Take Toner-Rodgers’ analysis of “materials similarity” where he claimed to have used crystal structure calculations to determine how similar the new materials were to previously discovered materials. The plot is stunningly unambiguous, the new materials discovered with AI are more novel.

However, it boggles the mind that a random economics student at MIT would be able to easily (and without providing any further details), perform the highly sophisticated technique from the paper he cites (De et al, 2016), especially in this elegantly formalized manner without any domain expertise in computational materials research. This graph, and the data it represents, if true, would probably be worth a Nature paper on AI materials discovery on its own. In his paper, it’s relegated to the appendices.

This methodology also makes no sense at generalizing across different types of materials, so I have no clue how you could reduce the results from such broad classes of materials to a single figure of merit in this manner. The gaps between 0.0 and 0.2 and 0.8 and 1.0 might seem reasonable to someone who read a few papers and noticed similar gaps in a couple of the graphs, but it would be bizarre when generalized across several classes of materials, and the data is likely completely fabricated for this reason. To simplify this critique, a novel metal alloy would have a very different level of similarity from a reference class of previously-discovered alloys, than a novel polymer would from its own reference class. It would require some really sophisticated methodology to normalize this single figure of merit across material types, which Toner-Rodgers does not mention at all. Also, this would all be insanely challenging to implement using data from the Materials Project, requiring some sophisticated “big data” workflows. If you want a smoking gun, here’s a graph from a paper, Krieger et al, “Missing Novelty in Drug Development,” which Toner-Rodgers cites, using a similar methodology for drug discovery. It looks eerily similar to the distribution in this preprint. This distribution might make sense for drugs, but makes very little intuitive sense for a broad range of materials, with the figure of merit derived directly from the atomic positions in the crystal structure. This is the kind of mistake that someone with no domain expertise in materials science might make.

Toner-Rodgers’ treatment of “materials quality” would also probably drive a materials scientist insane if they were forced to think about it at length.

Here’s the equation he uses to calculate the “quality” of a new material:

This would likely be a case of extreme garbage in: garbage out. First of all, there are typically no “target features” that are easily reduced to single values, but also, even if there were, some of these would be distributed on a log scale, which would dramatically skew the values for certain classes of materials. Also, in general, the “quality” of a new material that an R&D lab develops is likely not at all related to improvements in the actual top-line figures of merit like “band gap” or “refractive index”, the two examples that Toner-Rodgers gives. Instead, they would be for things like durability, affordability, ease of manufacture, etc. These are all properties that are not easily reduced to a single value. And even if they were, good luck getting researchers to measure, systematize, and document these values for the new materials!

However, from this amalgam of gibberish, Toner-Rodgers manages to extract a significant finding anyway! All 1,018 scientists contribute to this endeavor, and statistically significant findings are reported in every single category:

Late Night Open Thread: On Doomerism by Anne Laurie Wednesday January 14th, 2026 at 12:45 PM

ICE Tells North Side Alderperson To Back Off As She Warns Neighbors About Immigration Agents by Colin Boyle and Alex V. Hernandez Wednesday October 22nd, 2025 at 9:10 AM

Listen 🎧

Agents Active On North Side Tuesday

Listen 🎧

National Guard Block Extended As Trump Lawyers Eye Supreme Court

Mother Of ‘Face of Operation Midway Blitz’ Speaks Out

Fundraiser For Woman Wrongfully Detained By Feds Raises Over $10,000

Happening In Chicago

AI Isn't Always Intelligent, Like These 24 AI Fails Thursday June 26th, 2025 at 7:50 PM

Killing it

Is Google evil?

This seems… wrong

Oh… oh no…

No one needed this

Understood

Totally safe

Thanks, Google

Very experimental

I don't know what to say

This is so funny

Wait a second…

It sure is

Cool, AI

Good to know

Ohhh…

Hmmm…

AI is the future

Screenshot in 2025

Where is mama?

Ya don't say…

Wonderful

You don't say

Tasty

Inside the AI Prompts DOGE Used to “Munch” Contracts Related to Veterans’ Health by by Brandon Roberts and Vernal Coleman Sunday June 8th, 2025 at 8:50 AM

isn’t it crazy that a woman being gender nonconforming literally just requires her to exist in her… Friday May 30th, 2025 at 9:06 PM

Chasing the Electric Pangolin Open Thread Sunday May 18th, 2025 at 6:43 PM

Late Night Open Thread: On Doomerism by Anne Laurie
Wednesday January 14^th, 2026 at 12:45 PM

ICE Tells North Side Alderperson To Back Off As She Warns Neighbors About Immigration Agents by Colin Boyle and Alex V. Hernandez
Wednesday October 22^nd, 2025 at 9:10 AM

AI Isn't Always Intelligent, Like These 24 AI Fails
Thursday June 26^th, 2025 at 7:50 PM

Inside the AI Prompts DOGE Used to “Munch” Contracts Related to Veterans’ Health by by Brandon Roberts and Vernal Coleman
Sunday June 8^th, 2025 at 8:50 AM

isn’t it crazy that a woman being gender nonconforming literally just requires her to exist in her…
Friday May 30^th, 2025 at 9:06 PM

Chasing the Electric Pangolin Open Thread
Sunday May 18^th, 2025 at 6:43 PM