Evaluation, Anticipation and Control
Valencia, 8th March 2023
Find the recording of the whole event here!
"Predictable AI: Evaluation, Anticipation and Control" is a singular event that will consist of invited talks, panels, short highlights and time for networking. In the morning, we will put the emphasis on “Predictable AI Futures” around scaling laws, control, liability and future risks. In the afternoon, we will focus on “Predictable AI Systems”: covering cognitive and robust evaluation, assessors, co-operative conditions, uncertainty estimation, etc.
8:30h - Accreditation (badges)
9:00h - Opening
Salvador Coll (Vice-Chancellor for Innovation and Transfer - UPV): Welcome
Jose H. Orallo (UPV): Goals and agenda of the day
9:15h - Keynote (60' including Q/A). Chair: Jose H. Orallo (UPV).
Irina Rish (Mila, U. Montreal): "On Scaling Laws, Emergent Behaviors, and AI Democratization Efforts". Abstract: Large-scale unsupervised pre-trained models, a.k.a. “Foundation models”, are taking the AI field by storm, achieving state-of-the-art performance and impressive few-shot generalization abilities on a variety of tasks in multiple domains. Clearly, predicting the performance and other metrics of interest (robustness, truthfulness, etc.) at scale, including potential emergent behaviors, is crucial for (1) choosing learning methods that are likely to stand the test of time as larger compute becomes available, and (2) ensuring safe behavior of AI systems by anticipating potential emergent behaviors (“phase transitions”). We investigate both an “open-box” approach, where access to the learning dynamics and internal metrics of a neural network is available (e.g., in the case of “grokking” behavior), and a “closed-box” approach, where predictions of future behavior must be made solely from previous behavior, without internal measurements of the system being available. We present a generic predictive framework of Broken Neural Scaling Laws that allows extrapolating an extremely large range of behaviors, assuming you have enough training runs before a break to extrapolate what happens until the next sharp break. However, while recent developments in foundation models are highly promising, and advances in neural scaling laws allow for accurate predictions of their behaviors, a serious challenge for academia and non-profit/open-source AI in terms of being competitive in the area of foundation models is the lack of large-scale compute resources of the kind available to industry.
This motivated us – a rapidly growing international collaboration across several universities and non-profit organizations – to join forces and initiate an effort towards developing common objectives and tools for advancing the field of large-scale foundation models, and obtaining large-scale compute funded by either governments or private entities. Our long-term, overarching goal is to develop a wide international collaboration united by the objective of building foundation models that are increasingly powerful, while at the same time being safe, robust and aligned with human values.
10:15h - Coffee break (in the hall)
10:45h - Invited Talks (35' each, including Q/A). Chair: Fernando M. Plumed (UPV)
Emilia Gómez (JRC, European Commission): "Liability regimes in the age of AI". Abstract: New emerging technologies powered by Artificial Intelligence (AI) have the potential to disruptively transform our societies for the better. In particular, data-driven learning approaches (i.e., machine learning) have been a true revolution in the advancement of multiple technologies in various application domains. But at the same time there are growing concerns about certain intrinsic characteristics of these methodologies that carry potential risks to both safety and fundamental rights. Although there are mechanisms in the adoption process to minimize these risks (e.g. safety regulations), these do not exclude the possibility of harm occurring, and if this happens, victims should be able to seek compensation. Liability regimes will therefore play a key role in ensuring basic protection for victims using or interacting with these systems. However, the same characteristics that make AI systems inherently risky, such as lack of causality, opacity, unpredictability or their self-learning and continuous learning capabilities, lead to considerable difficulties when it comes to proving causation. This talk presents case studies such as cleaning robots, delivery drones and robots in education, as well as the methodology to reach them, that illustrate these difficulties. The outcome of the proposed analysis suggests the need to revise liability regimes to alleviate the burden of proof on victims in cases involving AI technologies.
Seán Ó hÉigeartaigh and Alex Marcoci (U. Cambridge): "How predictable is the future of AI?". Abstract: AI is set to impact nearly every industry and sector of society, and rapid continued progress is expected. Some experts predict the possibility of ‘transformative’ AI in coming decades - AI systems that would drive a global transformation of the magnitude of the industrial revolution - while others are sceptical. Growing efforts are going into forecasting the future of ‘frontier’ AI systems: extrapolating trends around performance and costs, predicting when milestones will be achieved, and anticipating societal impacts and risks. This talk will survey some past and present forecasting initiatives, and will pose questions about the role of forecasting. To what extent would it have been possible to predict present successes and challenges in AI? In what ways is forecasting useful for research and governance? What questions are most useful to answer?
11:55h - Lightning talks with Solutions and Ideas to the "Board Problems" (1' each, no Q/A). Chair: Jose H. Orallo
Speakers: Alexey Dubinsky, Lexin Zhou, Konstantinos Voudouris, Anna Wisakanto, Alexandre Bretel, Serhiy Kandul, Nikolaos Prodromos, Behzad Mehrbakhsh.
12:05h - Panel: “Making the future of AI predictable” (55'). Moderator: John Burden (U. Cambridge)
Panellists: Irina Rish (U. Montreal), Seán Ó hÉigeartaigh (U. Cambridge), Lawrence Phillips (Metaculus), Serhiy Kandul (U. Zurich).
13:00h - Lunch break (in the hall)
14:30h - Keynote (60' including Q/A). Chair: Cèsar Ferri (UPV).
Joel Leibo (DeepMind): "Evaluating and Improving Cooperative Artificial Intelligence". Abstract: Problems of cooperation—in which agents seek ways to jointly improve their welfare—are ubiquitous and important. They can be found at scales ranging from our daily routines—such as driving on highways, scheduling meetings, and working collaboratively—to our global challenges—such as peace, commerce, and pandemic preparedness. In this talk I'll discuss methods we've been developing to assess how well artificial intelligence systems navigate cooperation problems and some of our efforts to improve their cooperative capabilities.
15:30h - Minibreak
15:40h - Invited Talks (35' each, including Q/A). Chair: Danaja Rutar (U. Cambridge)
Ida Momennejad (Microsoft Research): "A Rubric for Human-like Agents and NeuroAI". Abstract: Researchers across cognitive, neuro- and computer sciences increasingly reference ‘human-like’ artificial intelligence and ‘neuroAI’. However, the scope and use of the terms are often inconsistent. Contributed research ranges widely from mimicking behaviour, to testing machine learning methods as neurally plausible hypotheses at the cellular or functional levels, or solving engineering problems. However, it cannot be assumed nor expected that progress on one of these three goals will automatically translate to progress in others. Here, a simple rubric is proposed to clarify the scope of individual contributions, grounded in their commitments to human-like behaviour, neural plausibility or benchmark/engineering/computer science goals. This is clarified using examples of weak and strong neuroAI and human-like agents, and discussing the generative, corroborative and corrective ways in which the three dimensions interact with one another. Future progress in artificial intelligence will need strong interactions across the disciplines, with iterative feedback loops and meticulous validity tests—leading to both known and yet-unknown advances that may span decades to come.
Lucy Cheke (U. Cambridge): "A Comparative Cognition Approach to AI Evaluation". Abstract: Understanding and predicting behaviour has been the business of psychologists for over a century. Within human psychology we can rely to some extent on introspection to understand the underlying drivers of behaviour, but this is less straightforward with animals. The problem of peering inside the "black box" of nonhuman animals shares much with the challenge of understanding the capabilities of AI systems, which exhibit extraordinarily clever-seeming behaviour but are prone to inflexibility and shortcuts. This talk will review the comparative cognition approach to AI evaluation and the benefits of robust cognitive testing of AI, both for understanding AI itself and for exploring biological intelligence.
16:50h - Panel: "How predictable can AI systems be?" (55'). Moderator: Ryan Burnell (U. Cambridge)
Panellists: Joel Leibo (DeepMind), Lucy Cheke (U. Cambridge), Peter Flach (U. Bristol), Ali Boyle (London School of Economics)
17:45h - Closing remarks (and introduction to the session that follows)
17:55h - Orxata and drinks in the hall: chat freely in groups or around the boards, each of which highlights one “Board Problem” of Predictable AI and is hosted by its lightning-talk speaker (the "champion").
Selected Board Problems and their champions:
BP4: Alexey Dubinsky + Wout Schellaert
BP5: Lexin Zhou + Yael Moros + Fernando Martínez-Plumed
BP6: Konstantinos Voudouris + RECoG-AI team
BP7: Anna Wisakanto
BP8: Alexandre Bretel
BP12: Serhiy Kandul
BP18: Nikolaos Prodromos
BP20: Behzad Mehrbakhsh
The event is co-organised and will include many colleagues from the CFI and CSER in Cambridge, and ValGRAI in Valencia.
Main contact person:
"Local" (Valencia/Cambridge) organising committee: Ana Cidad, Cèsar Ferri, Nando Martínez-Plumed, Danaja Rutar, John Burden, Ryan Burnell, Konstantinos Voudouris, Wout Schellaert, Behzad Mehrbakhsh, Lexin Zhou and Yael Moros.
Ciutat Politècnica de l'Innovació, Cub Blau Auditorium ("blue cube" on the top, building 8B, take the lift from basement to top floor or walk up the stairs to access from terrace), Universitat Politècnica de València, València, Spain
Travelling, hotels and other information about València at www.visitvalencia.com/en.
Registration extended: the event has moved to a bigger auditorium, Cub Blau (blue cube), with capacity for more than 200 people.