About the Team

Our team is responsible for the “post-training” or alignment of ChatGPT. We integrate various improvements from the rest of the company into our RLHF process, ultimately producing the models used by millions of users in both the ChatGPT product and the API.

About the Role

One of the most important parts of training ChatGPT is building and training on extremely high-quality datasets. We are looking for somebody to help us build infrastructure to manage this data! In contrast to most data engineering, dataset size is not the key factor here – instead we aim to bring more insight into our training data and continually increase its quality.

Ideal candidates should have a strong technical background and general knowledge. Given how tightly coupled our data systems are with the underlying models, candidates should have some familiarity with ML or ML engineering, either in a research context or in an applied setting.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

Build systems and tools for researchers to look at and transform datasets.

Co-design and build experimental primitives used to construct data pipelines to train prototype ChatGPT models.

Work with the ChatGPT product team to build distributed pipelines for examining and understanding large-scale usage data.

Help with other, more out-there research ideas involving data pipelines.

Help own the entire training distribution we train ChatGPT on.

You might thrive in this role if you:

Are a team player – willing to do a variety of tasks that move the team forward.

Have experience working in complex technical environments.

Enjoy working in a more research-oriented setting – these data systems are new, and the right solution is often not clear ahead of time.

Have experience with Python.

Have experience with Kubernetes or other distributed infrastructure.

Have experience with one or more large-scale data systems, such as Beam or Spark.

Compensation, Benefits and Perks

Total compensation also includes generous equity and benefits.

Medical, dental, and vision insurance for you and your family

Mental health and wellness support

401(k) plan with 50% matching

Unlimited time off and 18+ company holidays per year

Paid parental leave (20 weeks) and family-planning support

Annual learning & development stipend ($1,500 per year)

Annual Salary Range

$310,000 – $385,000 USD