What would it take to build AI ethically?

ladybugs@lemmy.world · edit-2 4 days ago

What would it take to build AI ethically?

Scott 🇨🇦🏴‍☠️@sh.itjust.works · 1 day ago

Pay people directly for their data for training. Not Reddit or Twitter, the people.

Don’t share any interactions with big tech, government, etc. Private and encrypted.

Run it locally.

YourMomsTrashman@lemmy.world · 2 days ago

Ethical “AI” exists. Upscaling models can be trained on just a couple thousand images. Handwriting recognition has been around for decades.

It’s not marketed as AI, because these are actual tools that people use, instead of … whatever LLMs are supposed to be.

technocrit@lemmy.dbzer0.com · 2 days ago

if there was a non-exploitative alternative.

LOL. We live under capitalism, homie.

dumnezero@piefed.social · 2 days ago

consent and not wasting resources

Hetare King@piefed.social · 4 days ago

I consider these to be the main ethical issues with specifically LLMs and generative AI in general:

1. Using people’s work as training data without consent.
2. The high cost of training a model, which means that only a few entities in the world can actually do so, and so only a few people get to decide what the knowledge base and “slant” of the model are. This is true even for open source models.
3. The high resource cost of using a model relative to the value of its output.
4. People with malicious intent being empowered by it far, far more than anyone else.
5. The model producing the response to the query directly instead of leading to the source, leaving the source without any way to benefit and the user without any context cues they can use to verify the reliability of the information.
6. Infinite and automated production of misinformation, libel and psychological manipulation.
7. Inducing psychosis in people.

Point 1 can be resolved by the people training AI just making different choices. Many won’t unless they’re forced to, but in principle they could.

Points 2 and 3 could hypothetically be resolved in the future with better technology.

The rest are basically inherent to the technology, and you can at best try and mostly fail to reduce the risk. So as far as I’m concerned, what it would take to build AI ethically is to train it for very specific purposes and have it be used as statistical models by people who know what they’re doing.

Though I do see some potential for ethical LLMs by using them to perform vector searches instead of generating text, basically turning them into smarter search engines.
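
To make the vector search idea concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: the three-number "embeddings", the file names and the `search` helper are assumptions, and a real system would get its vectors from an embedding model. The point is only that the output is a ranked list of sources, not generated text.

```python
import math

# Hypothetical toy index: source document -> embedding vector.
# Real embeddings come from a model and have hundreds of dimensions;
# these hand-written 3-number vectors are for illustration only.
INDEX = {
    "birdsong-field-guide.html": [0.9, 0.1, 0.0],
    "sourdough-recipe.html": [0.0, 0.2, 0.9],
    "owl-identification.html": [0.8, 0.3, 0.1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the top_k source names ranked by similarity to the query."""
    scored = [(cosine_similarity(query_vec, vec), source)
              for source, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [source for _, source in scored[:top_k]]


# A query vector that points in roughly the "birds" direction.
results = search([1.0, 0.2, 0.0], INDEX)
print(results)
```

The user gets pointed to the original documents, so the source can still benefit and the reader can still judge its reliability.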

ladybugs@lemmy.world · 2 days ago

I agree. I think to get around 4-7, we’d need a completely different type of AI. We’d need something that isn’t an LLM, but that can do some/all of the legitimate things people are trying to use LLMs for.

The vector search thing is nice. I used to sometimes like the automated music recommendations I got on certain streaming services, which I’m guessing worked something like that.

valar@lemmy.ca · 4 days ago

Pay for all the content you train the model on

one_old_coder@piefed.social · 4 days ago

Or only use your own data, which would make it useless. I thought about RAGs but Wikipedia says:

These documents supplement information from the LLM’s pre-existing training data

It seems that RAGs use stolen data anyway.

And it still wouldn’t solve the issue that managers demand 10 times more work for free, it wouldn’t stop making workers crazy with the flood of reviews in programming, and execs wouldn’t stop dreaming of laying off everyone.

I Cast Fist@programming.dev · 3 days ago

Getting rid of all the tech bros and CEOs is the first step. That is an actually important step because they’re the ones that spend tons of money on lobbying for laws that are good for them. Laws can be (un)ethical, or abused in unethical ways (see DMCA and patent trolling). Remove the main pushers for unethical computer and IP related laws and you fix a significant part of the problem.

Yaky@slrpnk.net · 4 days ago

Do you mean any AI, or text-generating LLMs?

I am fairly certain Cornell built BirdNET / Merlin Bird ID song identification using recordings in the public domain or with permissive licenses.

Same goes for iNaturalist and Seek using volunteer-submitted and identified photos.

So it’s possible to build domain-specific models with fewer ethical issues, but the push is for bullshit generators, unfortunately.

ladybugs@lemmy.world · 2 days ago

Dang, I wish AI were just things like bird identification tools! That would be a much more wholesome world.

I might actually use and contribute to BirdNET. It looks like it helps with global biodiversity monitoring, which is awesome.

Unrefined@anarchist.nexus · 1 day ago

Try https://github.com/tphakala/birdnet-go - a fork that looks a little more modern but still feeds Cornell.

CrocodilloBombardino@piefed.social · 4 days ago

if it were built outside of a capitalist system and in a way that is in ecological balance with the planet

Sanctus@anarchist.nexus · 4 days ago

Getting a repository where you commit your own work under legal threat to be trained on. Then tear the data centers down and only allow them to run locally. But it’s kind of too late for all that. Pandora’s box has been opened.

cloudy1999@sh.itjust.works · edit-2 4 days ago

I despise “AI” for quite a few reasons: It’s built on theft, it empowers the fascists and oligarchs, its masters seek to dis-empower or replace human workers and creatives, its name is a deception as well as its primary use case, etc. This community doesn’t need a rehash. I personally despise AI because I love the programming craft and I worry about a future where code is only generated, or worse: generated autonomously. Don’t get me started on “AI first” companies. Fuck that.

“AI” is an anti-human technology.

Now, separate “AI” and all its awfulness from LLM as an algorithm/data structure. Can LLMs be ethical? I honestly don’t know whether the good can be isolated from the bad. I started to brainstorm this out below, but the more I write, the less convinced I am that there’s a middle way. I’m afraid that much of the perceived benefit of LLMs is derived from the universal theft of training data.

Dear reader, please consider the following a brainstorm only from a non-expert Anti who’s trying in good faith to find a path.

–

Here are some possible ethical use cases:

Natural Language Interface - Like a Text-based User Interface (TUI) or Graphical User Interface (GUI) or Command Line Interface (CLI), but instead discerns user intent from human language
Pattern Recognition - Some of LLMs’ legitimate accomplishments have been their ability to pore over decades of human work and detect patterns that otherwise would have been missed. Examples: Recent Erdős and Knuth news. LLMs are reasonable at code review and bug/security flaw detection
Summarization/Search - LLMs and their precursors have been rehashing summaries of well-trodden topics in training data for years. Crafting summaries for human consumption seems an ‘ok’ use case, with the understanding that hallucinations are unavoidable. Examples: API documentation, code examples, encyclopedia-like snippets

IMO, an ethical LLM solution might have attributes like these. Disclaimer: I’m not an expert so some of this may be nonsense (“brainstorm”):

Public audit trail of training data
Author consent, voluntary or paid, for participation in training data
Harnesses should have a query-able manifest of valid operations. All user input should map to one of them
Harnesses should strictly require human acknowledgement before executing an operation, and especially when interacting with external systems
Human-first output - should encourage human learning and thought, not seek to replace it
Signed output - this one is tricky. I don’t know how to accomplish it. It would be great if LLM output could be signed in a way that excluded it from future training. The signature would also serve as notice to humans that the content is explicitly from an LLM. Web browsers could then have configurations to filter LLM content out so that users can consent to consume it. This solution may not be part of LLMs themselves
Limited topic/training data - imagine an LLM that’s only for recipes or only for a specific programming API or a specific news site. A smaller model should use fewer resources
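
As a rough illustration of the two harness points above (a query-able manifest, and human acknowledgement before anything runs), here is a minimal sketch. All the names (`MANIFEST`, `list_operations`, `run_operation`) and the recipe operations are hypothetical, and the hard part, mapping free-form user input to an operation name, is deliberately left out.

```python
# Hypothetical manifest: the only operations the harness is allowed to run.
MANIFEST = {
    "list_recipes": lambda: ["pancakes", "lentil soup"],
    "count_recipes": lambda: 2,
}


def list_operations():
    """Query the manifest: return the names of all valid operations."""
    return sorted(MANIFEST)


def run_operation(name, confirm):
    """Run a manifest operation only after a human acknowledges it.

    `confirm` is a callable that receives the operation name and returns
    True or False. In interactive use it could wrap input(); here it is
    passed in so the behaviour is easy to test.
    """
    if name not in MANIFEST:
        # Anything outside the manifest is rejected, never improvised.
        raise ValueError(f"'{name}' is not in the manifest")
    if not confirm(name):
        # The human declined, so nothing is executed.
        return None
    return MANIFEST[name]()


print(list_operations())
print(run_operation("count_recipes", confirm=lambda name: True))
print(run_operation("count_recipes", confirm=lambda name: False))
```

The design choice is that the model only ever proposes an operation name; the harness decides whether it is valid, and a person decides whether it runs.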

I have high doubts that these qualities can be achieved due to complexity and cost. Such is the price of legitimacy.

–

OK, that’s all. I’m going back now to stewing in my disdain for “AI”.

minor tweaks

Jared White ✌️ [HWC]@humansare.social · 3 days ago

I don’t agree summarization is an OK use case as a product. Maybe as a one-off thing that a user explicitly requests of their own data within an office suite or whatever.

Meanwhile, I’ve been looking at the mincemeat “Google AI mode” makes of my essays, completely changing the meaning and giving people false conclusions which misrepresent my position. It’s shockingly awful.

ladybugs@lemmy.world · 2 days ago

Yes, Google’s AI features are horrifying for so many reasons!

I can’t fathom all of the hallucinated information spread by Google alone, often to people who weren’t even trying to use AI but got an AI overview at the top of their results anyway. Google AI mode is just creating more BS that most people will never notice because they won’t check the original source of the information.

cloudy1999@sh.itjust.works · 3 days ago

Yes, they cannot reason at all, despite clever marketing names like ‘reasoning models’. A responsible operator must verify all output, something humans can’t collectively be trusted to do. Even when verification is performed, we must ask ourselves whether ‘old-fashioned’ thinking wouldn’t have given an equally good or better result. IMO, it’s hard to find anything positive about this technology.

Something related I’ve been thinking about: they’re unable to produce truth or lies, only output.

ladybugs@lemmy.world · 2 days ago

You hit the nail on the head. They produce output that mimics the appearance of a thoughtful response, but isn’t that at all. LLMs do not actually think and do not have any concept of truth.

This is probably why things like ClickUp naming their AI tool “Brain” annoy me so much. It’s designed partially as a way for organizations to get aggregated access to the major LLMs. So yeah, my former coworkers are getting LLM output from “ClickUp Brain.” What a marketing scam.

I’ve been wondering how people’s attitudes toward LLMs would shift if society collectively changed the language we used about them to be more accurate. Maybe there wouldn’t be so many people claiming “AI is great for research” and whatnot. Even then, though, I doubt people would fully get past the human tendency to trust confident-sounding language.

SaneMartigan@lemmy.world · 4 days ago

A socialist society built on equity for all. Automating jobs with AI to give people more free time would be great if it wasn’t making a rich minority richer.

CombatWombat@feddit.online · 4 days ago

Anil Dash claims he’s made an ethical AI: https://www.anildash.com/2026/04/28/one-good-ai-is-here/

I haven’t taken the time to verify his claims personally, but it sounds like a reasonable attempt:

What’s good? Something that checks every box I can think of for our most immediately positive goals: it’s trained entirely with data that were consensually gathered; it’s completely open source and open weights, so anybody can examine it to know exactly how it works and what biases or flaws it might have; it’s designed to run on ordinary computers that normal people have access to — including those that can run entirely on renewable and responsible energy sources. And it is controlled by creators, not extractors, people who are inarguably on the side of artists and creatives and those who make art and culture in the world, designed to support and enable and empower their expression. No billionaires or guests of Epstein’s island were involved in the creation of this technology.

ladybugs@lemmy.world · 2 days ago

Thank you so much for this! If I get into video editing at some point, I’ll totally use CorridorKey for green screen stuff. It actually does check all of my boxes and more. It’s a very narrow/specific tool, and it’s not something anyone could use to persuade a ChatGPT user to quit, but this gives me hope that other ethical AI tools are possible.

A Sharky Anthro@fedia.io · 4 days ago

As long as moneyed interests are involved with the idea of trying to turn a profit off of LLMs (which they have successfully conflated with AI), ethics will never be considered in any capacity. As it stands right now, a real AI is a pipe dream because techbros shot their shot way too early instead of funding multidisciplinary efforts to understand consciousness, the brain, and ways to simulate real reasoning/thought. With LLMs they’ve created a Frankenstein’s Monster that will never gain any form of intelligence, as it cannot possibly replicate the complex consciousness and reasoning process that living things possess.

Realistically, I’d prefer that humanity give up on this for the moment, because the technology to sustain data centers and cool them without severe environmental impact on OUR ONLY HOME PLANET is insufficient. Until we sort out our economic, social, and political issues, an actual artificial intelligence should be a low-as-fuck priority.

JustTesting@lemmy.hogru.ch · 4 days ago

There’s https://apertvs.ai/, which is probably as close as it gets. It’s government funded and made by universities. Afaik its datacenter is powered by hydro. But it is an academic project, so it still uses Common Crawl and other publicly available datasets, which is considered OK practice in academia but still means that consent is opt-out for anything publicly available. And of course no one uses this, because there’s no marketing and it’s not as ‘good’ as models trained on stolen data.

Plus you could still argue that the energy and taxpayer money could be better spent elsewhere.

ladybugs@lemmy.world · 2 days ago

That looks far better than the mainstream AI tools, but I don’t think respecting opt-outs is quite enough. It would be so much better if it were built from solely opt-in training data. As far as I can tell, it’s not attempting to tackle the hallucinations or environmental impact issues. Still, it would be a major change for the better if ChatGPT users switched to something like that.

JustTesting@lemmy.hogru.ch · 1 day ago

yes, agreed. And while right now its environmental impact is low due to renewable energy use, I have no doubt that if this caught on and had to serve more users, the environmental part would get just as bad as other LLMs.

ratrace@lemmy.zip · 4 days ago

Universities are the tip of the baby-killing military-industrial complex. You people are all so silly. There is no such thing as ethical AI.

FatherPeanut@pawb.social · 2 days ago

We live in capitalism, nothing doesn’t contribute to the orphan crushing machine, but we can still try.