Summer 2023 Cohort applications have closed

Our mission

The MATS program aims to find and train talented individuals for what we see as the world’s most urgent and talent-constrained problem: reducing risks from unaligned artificial intelligence (AI). We believe that ambitious young researchers from a variety of backgrounds have the potential to meaningfully contribute to the field of alignment research. We aim to provide the training, logistics, and community necessary to aid this transition. We also connect our scholars with funding to ensure their financial security. Please see our theory of change for more details.

Program details

MATS is a scientific and educational seminar and independent research program, intended to serve as an introduction to the field of AI alignment and to facilitate networking with alignment researchers and institutions. Read more about the program timeline and content in our program overview.

The MATS program is a joint initiative by the Stanford Existential Risks Initiative and the Berkeley Existential Risk Initiative, with support from Conjecture and Lightcone Infrastructure. We receive financial support from several sources, including the Survival and Flourishing Fund.

Who is this program for?

Our ideal applicant has:

  • an understanding of the AI alignment research landscape equivalent to having completed the AGI Safety Fundamentals course;

  • previous experience with technical research (e.g. ML, CS, maths, physics, or neuroscience), ideally at a postgraduate level;

  • strong motivation to pursue a career in AI alignment research, particularly to reduce global catastrophic risk.

Even if you do not entirely meet these criteria, we encourage you to apply! Several past scholars applied without strong expectations and were accepted.

How to apply

MATS will run several concurrent streams, each for a different alignment research agenda. Read through the descriptions of each stream and the associated candidate selection questions below. To apply for a stream, submit an application via this portal, including your resume and a response to the appropriate candidate selection questions. We will assess your application based on your responses to the mentor’s candidate selection questions and your prior research experience.

Please note that the candidate selection questions can be quite hard, depending on the mentor! Allow yourself sufficient time to apply to your chosen stream(s). A strong application to one stream may be of higher value than moderate applications to several. That said, feel free to apply to multiple streams; we will assess each application independently.

Applications for the Summer 2023 Cohort are now closed!

Program streams

  • Aligning language models

    Ethan Perez (Anthropic)

    Current ML models that predict human language are surprisingly powerful and might scale into transformative AI. What novel alignment failures will future models exhibit, how can we develop demonstrations of those failures, and how can we mitigate them?

  • Agent foundations

    John Wentworth

    Some systems in the world seem to behave like “agents”: they make consistent decisions and sometimes display complex goal-seeking behavior. What are the necessary components of such systems, and can we predict their emergence and behavior?

  • Consequentialist cognition and deep constraints

    Vivek Hebbar (MIRI)

    Many alignment proposals might be compromised by a “sharp left turn” of powerful consequentialist AI systems. How strongly does this constrain alignment proposals, and can such constraints inspire promising research directions?

  • Cyborgism

    Janus, Nicholas Kees Dupuis

    Rather than training autonomous AI systems to aid alignment research, we might increasingly empower human researchers with highly integrated AI augmentation. How can existing models be adapted for this purpose, and what new technologies are required?

  • Deceptive AI

    Evan Hubinger (Anthropic)

    Powerful AI systems may be instrumentally motivated to secretly manipulate their training process. What ML training processes and architectures might lead to this deceptive behavior, and how can it be detected or averted?

  • Evaluating dangerous capabilities

    Owain Evans (FHI)

    Evaluating ML models for potentially dangerous capabilities might help prevent unsafe deployment or otherwise aid alignment. Can we test language models for capabilities like situational awareness and deception, or predict their emergence?

  • Interdisciplinary AI safety

    Dan Hendrycks (CAIS)

    Interdisciplinary research aims to adapt knowledge from one academic discipline to help solve problems in another. Can we leverage the accumulated knowledge of academic fields beyond ML and computer science to benefit AI safety?

  • Mechanistic interpretability

    Neel Nanda (DeepMind), Lee Sharkey (Conjecture)

    Rigorously understanding how ML models function may allow us to identify and train against misalignment. Can we reverse engineer neural nets from their weights, similar to how one might reverse engineer a compiled binary program?

  • Multipolar AI safety

    Jesse Clifton (CLR), Daniel Kokotajlo (OpenAI)

    The world may soon contain multiple powerful AI systems with multiple human stakeholders. How can we ensure that multipolar AI-human systems appropriately represent the joint interests of humans and their institutions, and coordinate to avoid bad outcomes?

  • Powerseeking in language models

    Victoria Krakovna (DeepMind)

    Established theories of emergent agency and powerseeking in ML models are derived from the reinforcement learning setting. If AGI arises from a pre-trained language model, what aspects of powerseeking might be evident?

  • Shard theory

    Alex Turner (CHAI)

    Rather than producing models aligned towards a single goal, reinforcement learning might instill a complicated web of heuristics and proxy goals. How might these “shards of agency” be predicted, observed, and steered?

  • Understanding AI hacking

    Jeffrey Ladish

    Current and near-term language models have the potential to greatly empower hackers and fundamentally change cybersecurity. How effectively can current models assist bad actors, and how soon might models be capable of hacking unaided?

Mentors

Scholars conduct independent research in a stream led by an established alignment research mentor. Our current mentor list is here.

Alumni

MATS alumni have gone on to publish safety research, join alignment research teams (including at Anthropic and MIRI), and found three alignment research teams of their own. Our alumni spotlight is here.

Seminar program

MATS runs a series of educational seminars to provide scholars with a broad understanding of the field of alignment. Our Winter 2022 Cohort seminar program is available here.

FAQs

We answer many questions frequently asked by applicants and new scholars here. If you have a question not covered by this list, please use our website contact form.