The unstoppable advance of artificial intelligence: the challenge of finding new evaluation standards
Artificial intelligence (AI) has reached a critical point in its development, often matching or surpassing human performance on benchmark tasks such as reading comprehension, image classification and competition-level mathematics. This rapid advancement has rendered numerous conventional benchmarks outdated, creating a pressing need for new ways to evaluate the abilities of AI systems. Stanford University's AI Index Report 2024 documents this progress and the need for new benchmarks that can assess these systems effectively.
The rapid evolution of artificial intelligence
The Stanford report illustrates the rapid growth of AI tools over the last ten years. Artificial intelligence has progressed at a staggering pace, as Nestor Maslej, editor-in-chief of the AI Index, points out. Models that were once seen as innovative are now perceived as obsolete and limited, spurring the need to shift evaluation criteria towards more complex activities such as abstraction and reasoning. AI is no longer merely a predictive tool; it has become a partner in solving scientific problems, as evidenced by initiatives such as GNoME, which searches for new materials, and GraphCast, which forecasts the weather.
Ethical concerns and the costs of progress
As the capabilities of AI increase, so do the costs associated with its development and maintenance.
Training an advanced model such as GPT-4 is estimated to have cost about $78 million, reflecting the significant economic investment required.
Google's Gemini Ultra is estimated to have cost close to $191 million to train, demonstrating the scale at which large technology companies are operating.
There is growing concern about the increased use of energy and water resources needed to maintain data centers, which poses environmental challenges.
There is a well-founded fear that high-quality training data may run out before the end of this decade, which could compromise future AI development.
Ethical concerns about the impact of AI continue to grow, prompting debates about how to regulate and control its evolution.
The Stanford report also highlights an international divide in the perception of AI: some countries see it as a positive development, while others are skeptical and concerned about its possible negative consequences.
The role of industry and the need for ethical standards
Industry now dominates the development of new AI systems, in marked contrast with academia.
In 2023, industry produced 51 notable machine learning systems, while academic researchers produced 15.
This disparity has led to a change in the academic approach, encouraging critical examination of industrial systems.
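A key component in these assessments is the Graduate-Level Google-Proof Q&A Benchmark (GPQA).
GPQA consists of graduate-level multiple-choice questions in biology, physics and chemistry that are hard to answer even with access to web search; other new tests target the visual, mathematical and moral reasoning of large language models.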
Despite these efforts, there is no consensus on standardized assessments to measure responsible use of AI.
The lack of such standardized assessments makes it difficult to compare systems in terms of their potential risks.
It is crucial that agreements be reached to improve the comparison and understanding of the long-term impact of these systems.
The "AI Index 2024" report emphasizes the importance of establishing new evaluation criteria that keep pace with the rapid advancement of artificial intelligence. The challenge is not only to establish criteria that measure more complex skills, but also to incorporate ethical evaluations that promote the responsible use of AI. In the face of rising costs and ethical concerns, it is imperative that the global community unite in developing and adopting these new benchmarks, so that the AI revolution benefits humanity in a safe and sustainable manner.