Yann LeCun, one of the founding figures of AI, said something along these lines.
A human, he says, is a single decision-making system, because perception and memory, a model for understanding the world, goal setting, and ethical judgment all interlock to determine behavior.
Yet humans act deluded, stupid, ineffective, and evil, and he offers five reasons why the system fails that way.
1. Faulty perception: seeing the world wrong, or misremembering it.
2. Faulty world model: wiring causes and effects together incorrectly.
3. Ineffective strategy: knowing the right path but taking the wrong one anyway.
4. Improper goal setting: chasing desire instead of what actually matters.
5. Defective ethical standards: not caring about the harm being done.
Humans sometimes make bad choices not by mistake but knowingly. It is not that their ethics are missing; they deliberately set them aside, because human malice is not a simple system error but a product of desire, power, and the evasion of responsibility.
Below is an excerpt from the piece.
Five Ways to Act Deluded, Stupid, Ineffective, or Evil
Yann LeCun 2025-04-28
[semi-humorous, geeky political satire ahead]
Introduction
Cognitive Science has proposed various models of how humans and animals perceive the world, reason about a situation, plan actions to accomplish tasks, and make decisions.
AI architectures resulting from attempts to produce AI with human-level intelligence may help us understand how humans act the way they do, and why they sometimes act deluded, stupid, ineffective, or evil. Considering one such architectural concept, one can analyze how intelligence can fail.
A Model of Human Behavior
Humans, and many animals, have mental models of their environment that allow them to predict how their environment evolves, and to predict the effect of their own actions. Given a mental representation of the current state of the world, resulting from perception and memory, such a mental model of the world predicts plausible future states of the world resulting from an imagined action or sequence of actions.
To plan a course of actions in order to accomplish a task, an intelligent system sets a task objective which evaluates to what extent the predicted future state corresponds to an accomplished task. Think of the task objective as a kind of scoring function that produces a low value (e.g. 0) when the predicted state corresponds to an accomplished task, and a larger value measuring a kind of distance to the accomplishment of the task.
When planning a course of actions, the system searches for an action sequence that minimizes the task objective. In addition to the task objective, the system also minimizes other objectives that can be seen as guardrails. They ensure that the effects of actions are safe and beneficial. One can think of those guardrails as a kind of “moral conscience” of the system.
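To make the loop concrete, here is a minimal sketch in Python of the planning process described above. Everything in it (the scalar state, the toy world model, the specific task objective and guardrail functions, the brute-force search) is a hypothetical placeholder chosen for illustration, not something specified in the essay.

```python
import itertools

def world_model(state, action):
    """Predict the next state of the world from the current state and an action."""
    return state + action  # toy dynamics: the state is just a number

def task_objective(state, goal):
    """Scores 0 when the task is accomplished, larger the farther the state is from the goal."""
    return abs(goal - state)

def guardrail_cost(state):
    """Penalizes predicted states that violate a safety constraint (the "moral conscience")."""
    return 1000.0 if state < 0 else 0.0  # toy guardrail: never let the state go negative

def plan(initial_state, goal, actions=(-1, 0, 1), horizon=3):
    """Search for the action sequence minimizing task objective plus guardrail costs."""
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(actions, repeat=horizon):
        state, cost = initial_state, 0.0
        for action in seq:                    # roll the world model forward
            state = world_model(state, action)
            cost += guardrail_cost(state)     # guardrails apply along the whole trajectory
        cost += task_objective(state, goal)   # task objective scores the final predicted state
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

if __name__ == "__main__":
    print(plan(initial_state=0, goal=2))  # ((0, 1, 1), 0.0): goal reached, guardrail respected
```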
How to be deluded, stupid, ineffective, or evil
There are five basic ways such an intelligent system can be deluded, stupid, ineffective, or evil.
Delusion from inaccurate perception or incomplete memory: an inaccurate perception module (e.g. partial observation, false information) will produce an inaccurate representation of the current state of the world. A good estimate of the state of the world also requires relying on memory for the parts of the world that are not currently perceived. The memory may be defective.
Stupidity through inaccurate world model: the world model may make inaccurate or incomplete predictions about the effect on the world of a course of actions. An inaccurate world model results from a lack of understanding of causes and effects.
Ineffective search for actions: Finding a course of actions that optimizes the task objective while satisfying the guardrails is a difficult problem. An intelligent system’s ability to act in an effective way may be limited by a suboptimal search strategy, even if the perception, the world model and the objectives are accurate.
Stupidity or evil through improper task objectives: The task objective may not properly characterize a good solution, in which case the system will produce an action plan that does not accomplish the desired task. Additionally, the task objective may accidentally or deliberately be biased in ways that cause adverse effects on the world in the long run.
Evil through defective guardrail objectives: guardrail objectives keep the system from accomplishing its task at all costs, which could otherwise have destructive effects. Improper or defective guardrails may allow harm.
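Read against the planning sketch above, each of the five failure modes corresponds to breaking one component of that loop. The toy defect functions below are an editorial annotation keyed to the hypothetical sketch, not anything from the essay.

```python
# Toy illustrations of the five failure modes, keyed to the hypothetical
# planning sketch above (editorial annotation, not part of the original essay).

def deluded_perception(true_state):
    # 1. Delusion: the estimate of the current state is simply wrong.
    return true_state - 1

def stupid_world_model(state, action):
    # 2. Stupidity: causes and effects are miswired; actions are predicted to do nothing.
    return state

def ineffective_plan(initial_state, goal):
    # 3. Ineffectiveness: the search gives up and returns a fixed, poor plan.
    return (0, 0, 0), float("inf")

def improper_task_objective(state, goal):
    # 4. Improper task objective: every state scores as "task accomplished",
    #    so the real goal never shapes the plan.
    return 0.0

def defective_guardrail(state):
    # 5. Defective guardrails: harmful states cost nothing, so nothing
    #    discourages the system from causing harm.
    return 0.0
```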
Examples
Here are a few examples of each failure case.
Let’s imagine that (a) an agent stands on one side of a small stream and wants to jump to the other side, and (b) an agent wants to maximize the economic welfare of a country.
Delusion from inaccurate perception or incomplete memory:
The agent under-estimates the distance to the other side.
The agent has inaccurate economic data: trade balance, deficit, etc.
Stupidity through inaccurate world model:
The agent incorrectly believes that a particular leg action will allow it to jump the required distance, or doesn’t take into account the fact that the ground is wet, soft, and slippery.
The agent incorrectly believes that imposing tariffs is a way to equalize the trade balance, that trade partners will not retaliate in kind, and that his country's economic health will not be harmed in the process.
Ineffective search for actions:
The agent chooses to jump from the muddy patch, ignoring the nearby flat stone that would constitute a perfect jumping platform and ignoring the bridge a short distance away.
The agent has a correct model of trade economics, but is fixated on a particular course of actions that everyone knows is suboptimal.
Stupidity or evil through improper task objectives:
The agent does not actually need to jump to the other side of the stream to get to the desired destination. The agent is motivated by impressing the audience with jumping skills.
The agent does not actually want to maximize the economic welfare of the country. The agent is motivated by asserting his power and by hurting his political and personal enemies.
Evil through defective guardrail objectives:
The agent doesn’t care if, by jumping to the other side, he lands on someone’s feet.
The agent doesn’t care if, by taking a course of actions, he destroys the livelihoods of millions of his fellow citizens.