ML Research Internship at TikTok

This internship was one of the best experiences of my life, not just for what I learned but also for the work culture, support, and diversity. Before I go into the details of my work during the internship, I first need to introduce the company and its organizational structure.

Bytedance is the parent company of TikTok, so every TikTok employee is by default an employee of Bytedance. Apart from TikTok, Bytedance owns Douyin, Toutiao, Helo, and Lark. Most of the applications used at Bytedance are developed in-house. Just as Google has its own cloud where it hosts Google Docs, Sheets, Drive, etc., Bytedance has its own cloud where it hosts Lark docs, sheets, drive, etc. All of these applications have a one-click translation feature that translates a complete doc into another language. This feature is especially useful because Bytedance has a global team, and many people in China work closely with people in the USA and other countries.

Technical teams are divided into many super teams such as Applied ML, Data Engineering, Search, etc. These super teams have many internal divisions based on their roles and responsibilities. For example, the Applied ML (AML) team is divided into AML-Data, AML-RecSys, AML-Innovation, etc. I worked closely with the AML-Innovation team, which does extensive research to develop new features for different Bytedance products.

Project

I worked on TikTok Ads, which is part of the larger recommendation system that drives TikTok. The entire recommendation system (RecSys) for TikTok has a very complex architecture and uses many ML models for different use cases.

I worked on one of these models, the one used for ads ranking. When users watch videos on TikTok, the app doesn't just show videos. It also shows ads in between. The ads ranking model sorts the ads in a personalized manner so that the ads shown are ones that interest the user. My task was to improve the ads ranking model with new techniques. The problem was open-ended: I had to identify existing problems with the model and propose solutions that could address them.
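To make "sorting ads in a personalized manner" concrete, here is a toy sketch. It is not the TikTok model, which is confidential. It assumes each user and each ad is represented by a small made-up vector, scores every ad with a dot product, and sorts by that score. The function name `rank_ads`, the ad names, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_ads(
    user_vec: List[float], ad_vecs: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Score each ad by its dot product with the user vector, best first."""
    scored = [
        (ad_id, sum(u * a for u, a in zip(user_vec, vec)))
        for ad_id, vec in ad_vecs.items()
    ]
    # A higher score is treated as "more likely to interest this user".
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical interest vector for one user and three candidate ads.
user = [1.0, 0.0, 0.5]
ads = {
    "sneakers": [0.9, 0.1, 0.2],
    "cookware": [0.1, 0.8, 0.0],
    "headphones": [0.4, 0.0, 1.0],
}

ranking = rank_ads(user, ads)
for ad_id, score in ranking:
    print(ad_id, round(score, 2))
```

A production ranking model would learn its scores from data and combine many more signals. The sketch only shows the "score, then sort per user" shape of the problem.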

Because of confidentiality, I cannot describe the model in detail here, but I can point to some research papers if you are interested in learning about RecSys and Ads.

Daily Research Work

Typical industry research focuses on creating an impact on business metrics like revenue, user retention, user experience, etc. Even a small change in the model is a good research result if it creates a significant impact on the business. My work on the ads ranking model was similar. Every day, I started many experiments with different methods. It usually took around a day to train a new model with an experimental method and get the results. Based on the results of the experiments, I would modify the method or move on to another one. One of the best pieces of advice about research that I received from my intern manager was this:

In ML research, one of the best skills you can cultivate is deciding what next to experiment on once you get the results of a certain experiment.

I had daily 30-minute 1:1 meetings with my intern manager, who is a machine learning engineer at TikTok. During these meetings, we would discuss the findings of the experiments and brainstorm on why certain things worked or didn't work. In research terms, most of these intuitions are hypotheses, and we would then run experiments to verify them. The objective of the research is to create an impact on the business, and we can't do so without building enough understanding of the data and how the model behaves on that data. Verifying hypotheses helps in understanding the model. With that understanding, we can discover new ideas and methods to try on the model.

By the end of my internship, we were able to improve the ads ranking model significantly for multiple global regions. I can't discuss the exact figures, but I can say that the improvement was good enough to launch the change, and it will serve live users starting from mid-August.

Machine Learning Infrastructure

The ML infrastructure at Bytedance is one of the best I have seen and is somewhat similar to that of other big companies like Meta (Facebook), Google, and Apple (as described by engineers who previously worked at these companies). Most of the model training and serving pipelines scale well enough to train and serve very large models on very large datasets.

The model training platforms have a great user experience for ML engineers. They remove the overhead of analyzing model results by automatically analyzing the important metrics of the model.

Evaluations & Return Full-time Offer

Bytedance is transparent when it comes to evaluating its employees. As interns, we were evaluated by our intern managers and by other colleagues working on similar projects. Overall, the evaluations gave me detailed feedback about my work, communication skills, and areas of improvement. I was able to stay on track with the work because of constant, valuable feedback from my intern manager and other peers.

Bytedance gives return offers to interns who perform well. The decisions are currently being made, and we will find out about the return offer by the first week of September 2022.

Edit: Today is 3rd October 2022. Last week, a recruiter scheduled a Zoom call with me and told me that they are extending a full-time return offer as a Machine Learning Engineer in the Mountain View, CA office. I am more than grateful to all the people who have been part of my journey so far.

As a takeaway, I want to describe some of the things I focused on during my internship to convert it into a full-time role:

At the start of the internship, I clearly told my intern manager that I was open to feedback. At any point, if he felt that I was working too slowly or not delivering quality work, he could tell me immediately.
I was not required to attend internal team discussions and daily sync-ups, but I attended those meetings anyway in order to learn about other ongoing projects and to communicate with other team members.
The Applied ML team at Bytedance has documented its research to date very well. I read about different ML models, infrastructure solutions, paper summaries, etc., and it helped me a lot in the research for the project assigned to me.
During the mid-term and final evaluations of the internship, I had to give a presentation about the progress of the project. I tried structuring it so that I could explain the progress with specific reasons. For example, I started with method X but it created problem P. To solve P, I experimented with method Y by tuning parameter L. Most ML Engineers and Data Scientists love this kind of presentation, which is told like a story.
I also talked to fellow interns to learn more about their work as well as their approaches to converting the internship into a full-time role.
And last but not least, I was very aggressive in running experiments. In ML research, getting results is not in your hands, but consistently running quality experiments is. I ran on average 15-20 experiments per day initially, which increased to 30-35 experiments per day during the last few weeks of the internship.

Work Culture

The company culture is very diverse, knowledge-enriching, and super-supportive. The company has a flat employee structure, so nobody is your boss. Everyone works as peers. The diversity and inclusion team at TikTok organizes TedToks (the word is inspired by TikToks) where experienced senior folks in the company share their experience and give advice to new employees.

Before joining the internship, I read reviews on Blind saying that TikTok's work culture is not good and that it doesn't support work-life balance. But my experience was almost the reverse of that. During my internship, I learned many great things, and I never felt exhausted from overwork or from peer pressure. I had the freedom to message anyone on the team or in the company if I needed help. So, I don't think there is any issue with work-life balance, at least for the machine learning team at TikTok.

Also, the company trusts its employees. As an intern, I never expected that I would get to work on a project that could directly impact the product. Trust, help, and excitement are the three things I always felt while working with these awesome folks. One of the principles that the company promotes among its employees is AlwaysDay1, which means that employees should work every day with the same excitement they had on their first day.

Thank you so much for reading!

I (Ashutosh Hathidara) am an artificial intelligence engineer, full-stack developer, and designer. I can be reached via:
Email: ashutoshhathidara98@gmail.com
Twitter: @ashutosh_1919
Github: @ashutosh1919
Discord: DevSense
LeetCode: @layman_brother
