Related papers: Phi-4 Technical Report

200 papers

We introduce Phi-4-reasoning, a 14-billion parameter reasoning model that achieves strong performance on complex reasoning tasks. Trained via supervised fine-tuning of Phi-4 on a carefully curated set of "teachable" prompts, selected for the…

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality"…

We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5…

Computation and Language · Computer Science 2024-09-04 Marah Abdin , Jyoti Aneja , Hany Awadalla , Ahmed Awadallah , Ammar Ahmad Awan , Nguyen Bach , Amit Bahree , Arash Bakhtiari , Jianmin Bao , Harkirat Behl , Alon Benhaim , Misha Bilenko , Johan Bjorck , Sébastien Bubeck , Martin Cai , Qin Cai , Vishrav Chaudhary , Dong Chen , Dongdong Chen , Weizhu Chen , Yen-Chun Chen , Yi-Ling Chen , Hao Cheng , Parul Chopra , Xiyang Dai , Matthew Dixon , Ronen Eldan , Victor Fragoso , Jianfeng Gao , Mei Gao , Min Gao , Amit Garg , Allie Del Giorno , Abhishek Goswami , Suriya Gunasekar , Emman Haider , Junheng Hao , Russell J. Hewett , Wenxiang Hu , Jamie Huynh , Dan Iter , Sam Ade Jacobs , Mojan Javaheripi , Xin Jin , Nikos Karampatziakis , Piero Kauffmann , Mahoud Khademi , Dongwoo Kim , Young Jin Kim , Lev Kurilenko , James R. Lee , Yin Tat Lee , Yuanzhi Li , Yunsheng Li , Chen Liang , Lars Liden , Xihui Lin , Zeqi Lin , Ce Liu , Liyuan Liu , Mengchen Liu , Weishung Liu , Xiaodong Liu , Chong Luo , Piyush Madan , Ali Mahmoudzadeh , David Majercak , Matt Mazzola , Caio César Teodoro Mendes , Arindam Mitra , Hardik Modi , Anh Nguyen , Brandon Norick , Barun Patra , Daniel Perez-Becker , Thomas Portet , Reid Pryzant , Heyang Qin , Marko Radmilac , Liliang Ren , Gustavo de Rosa , Corby Rosset , Sambudha Roy , Olatunji Ruwase , Olli Saarikivi , Amin Saied , Adil Salim , Michael Santacroce , Shital Shah , Ning Shang , Hiteshi Sharma , Yelong Shen , Swadheen Shukla , Xia Song , Masahiro Tanaka , Andrea Tupini , Praneetha Vaddamanu , Chunyu Wang , Guanhua Wang , Lijuan Wang , Shuohang Wang , Xin Wang , Yu Wang , Rachel Ward , Wen Wen , Philipp Witte , Haiping Wu , Xiaoxia Wu , Michael Wyatt , Bin Xiao , Can Xu , Jiahang Xu , Weijian Xu , Jilong Xue , Sonali Yadav , Fan Yang , Jianwei Yang , Yifan Yang , Ziyi Yang , Donghan Yu , Lu Yuan , Chenruidong Zhang , Cyril Zhang , Jianwen Zhang , Li Lyna Zhang , Yi Zhang , Yue Zhang , Yunan Zhang , Xiren Zhou

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on…

Computation and Language · Computer Science 2024-03-11 OpenAI , Josh Achiam , Steven Adler , Sandhini Agarwal , Lama Ahmad , Ilge Akkaya , Florencia Leoni Aleman , Diogo Almeida , Janko Altenschmidt , Sam Altman , Shyamal Anadkat , Red Avila , Igor Babuschkin , Suchir Balaji , Valerie Balcom , Paul Baltescu , Haiming Bao , Mohammad Bavarian , Jeff Belgum , Irwan Bello , Jake Berdine , Gabriel Bernadett-Shapiro , Christopher Berner , Lenny Bogdonoff , Oleg Boiko , Madelaine Boyd , Anna-Luisa Brakman , Greg Brockman , Tim Brooks , Miles Brundage , Kevin Button , Trevor Cai , Rosie Campbell , Andrew Cann , Brittany Carey , Chelsea Carlson , Rory Carmichael , Brooke Chan , Che Chang , Fotis Chantzis , Derek Chen , Sully Chen , Ruby Chen , Jason Chen , Mark Chen , Ben Chess , Chester Cho , Casey Chu , Hyung Won Chung , Dave Cummings , Jeremiah Currier , Yunxing Dai , Cory Decareaux , Thomas Degry , Noah Deutsch , Damien Deville , Arka Dhar , David Dohan , Steve Dowling , Sheila Dunning , Adrien Ecoffet , Atty Eleti , Tyna Eloundou , David Farhi , Liam Fedus , Niko Felix , Simón Posada Fishman , Juston Forte , Isabella Fulford , Leo Gao , Elie Georges , Christian Gibson , Vik Goel , Tarun Gogineni , Gabriel Goh , Rapha Gontijo-Lopes , Jonathan Gordon , Morgan Grafstein , Scott Gray , Ryan Greene , Joshua Gross , Shixiang Shane Gu , Yufei Guo , Chris Hallacy , Jesse Han , Jeff Harris , Yuchen He , Mike Heaton , Johannes Heidecke , Chris Hesse , Alan Hickey , Wade Hickey , Peter Hoeschele , Brandon Houghton , Kenny Hsu , Shengli Hu , Xin Hu , Joost Huizinga , Shantanu Jain , Shawn Jain , Joanne Jang , Angela Jiang , Roger Jiang , Haozhun Jin , Denny Jin , Shino Jomoto , Billie Jonn , Heewoo Jun , Tomer Kaftan , Łukasz Kaiser , Ali Kamali , Ingmar Kanitscheider , Nitish Shirish Keskar , Tabarak Khan , Logan Kilpatrick , Jong Wook Kim , Christina Kim , Yongjik Kim , Jan Hendrik Kirchner , Jamie Kiros , Matt Knight , Daniel Kokotajlo , Łukasz Kondraciuk , Andrew Kondrich , Aris Konstantinidis , Kyle Kosic , Gretchen Krueger , Vishal Kuo , Michael Lampe , Ikai Lan , Teddy Lee , Jan Leike , Jade Leung , Daniel Levy , Chak Ming Li , Rachel Lim , Molly Lin , Stephanie Lin , Mateusz Litwin , Theresa Lopez , Ryan Lowe , Patricia Lue , Anna Makanju , Kim Malfacini , Sam Manning , Todor Markov , Yaniv Markovski , Bianca Martin , Katie Mayer , Andrew Mayne , Bob McGrew , Scott Mayer McKinney , Christine McLeavey , Paul McMillan , Jake McNeil , David Medina , Aalok Mehta , Jacob Menick , Luke Metz , Andrey Mishchenko , Pamela Mishkin , Vinnie Monaco , Evan Morikawa , Daniel Mossing , Tong Mu , Mira Murati , Oleg Murk , David Mély , Ashvin Nair , Reiichiro Nakano , Rajeev Nayak , Arvind Neelakantan , Richard Ngo , Hyeonwoo Noh , Long Ouyang , Cullen O'Keefe , Jakub Pachocki , Alex Paino , Joe Palermo , Ashley Pantuliano , Giambattista Parascandolo , Joel Parish , Emy Parparita , Alex Passos , Mikhail Pavlov , Andrew Peng , Adam Perelman , Filipe de Avila Belbute Peres , Michael Petrov , Henrique Ponde de Oliveira Pinto , Michael Pokorny , Michelle Pokrass , Vitchyr H. Pong , Tolly Powell , Alethea Power , Boris Power , Elizabeth Proehl , Raul Puri , Alec Radford , Jack Rae , Aditya Ramesh , Cameron Raymond , Francis Real , Kendra Rimbach , Carl Ross , Bob Rotsted , Henri Roussez , Nick Ryder , Mario Saltarelli , Ted Sanders , Shibani Santurkar , Girish Sastry , Heather Schmidt , David Schnurr , John Schulman , Daniel Selsam , Kyla Sheppard , Toki Sherbakov , Jessica Shieh , Sarah Shoker , Pranav Shyam , Szymon Sidor , Eric Sigler , Maddie Simens , Jordan Sitkin , Katarina Slama , Ian Sohl , Benjamin Sokolowsky , Yang Song , Natalie Staudacher , Felipe Petroski Such , Natalie Summers , Ilya Sutskever , Jie Tang , Nikolas Tezak , Madeleine B. Thompson , Phil Tillet , Amin Tootoonchian , Elizabeth Tseng , Preston Tuggle , Nick Turley , Jerry Tworek , Juan Felipe Cerón Uribe , Andrea Vallone , Arun Vijayvergiya , Chelsea Voss , Carroll Wainwright , Justin Jay Wang , Alvin Wang , Ben Wang , Jonathan Ward , Jason Wei , CJ Weinmann , Akila Welihinda , Peter Welinder , Jiayi Weng , Lilian Weng , Matt Wiethoff , Dave Willner , Clemens Winter , Samuel Wolrich , Hannah Wong , Lauren Workman , Sherwin Wu , Jeff Wu , Michael Wu , Kai Xiao , Tao Xu , Sarah Yoo , Kevin Yu , Qiming Yuan , Wojciech Zaremba , Rowan Zellers , Chong Zhang , Marvin Zhang , Shengjia Zhao , Tianhao Zheng , Juntang Zhuang , William Zhuk , Barret Zoph

In 2022, with the release of ChatGPT, large-scale language models gained widespread attention. ChatGPT not only surpassed previous models in terms of parameters and the scale of its pretraining corpus but also achieved revolutionary…

Artificial Intelligence · Computer Science 2024-11-13 Yiming Ju , Huanhuan Ma

Most large language models are fine-tuned using either expensive human-annotated data or GPT-4 generated data which cannot guarantee performance in certain domains. We argue that although the web-crawled data often has formatting errors…

Computation and Language · Computer Science 2024-08-16 Jing Zhou , Chenglin Jiang , Wei Shen , Xiao Zhou , Xiaonan He

The burgeoning computational demands for training large language models (LLMs) necessitate efficient methods, including quantized training, which leverages low-bit arithmetic operations to reduce costs. While FP8 precision has shown…

Machine Learning · Computer Science 2025-02-18 Jiecheng Zhou , Ding Tang , Rong Fu , Boni Hu , Haoran Xu , Yi Wang , Zhilin Pei , Zhongling Su , Liang Liu , Xingcheng Zhang , Weiming Zhang

Large language models require vast amounts of high-quality training data, but effective filtering of web-scale datasets remains a significant challenge. This paper demonstrates that GPT-4o is remarkably effective at identifying high-quality…

Computation and Language · Computer Science 2025-02-03 Jifan Zhang , Ziyue Luo , Jia Liu , Ness Shroff , Robert Nowak

The rapid advancement of large language models, such as the Generative Pre-trained Transformer (GPT) series, has had significant implications across various disciplines. In this study, we investigate the potential of the state-of-the-art…

Computation and Language · Computer Science 2023-09-06 Yunhao Yang , Anshul Tomar

Chain-of-Thought (CoT) significantly enhances formal reasoning capabilities in Large Language Models (LLMs) by training them to explicitly generate intermediate reasoning steps. While LLMs readily benefit from such techniques, improving…

Recently published work on rephrasing natural text data for pre-training LLMs has shown promising results when combining the original dataset with the synthetically rephrased data. We build upon previous work by replicating existing results…

Large Language Models (LLMs), typified by OpenAI's GPT, have marked a significant advancement in artificial intelligence. Trained on vast amounts of text data, LLMs are capable of understanding and generating human-like text across a…

Artificial Intelligence · Computer Science 2024-10-29 Haochen Zhang , Yuyang Dong , Chuan Xiao , Masafumi Oyamada

Prior work has shown that finetuning large language models (LLMs) using machine-generated instruction-following data enables such models to achieve remarkable zero-shot capabilities on new tasks, and no human-written instructions are…

Computation and Language · Computer Science 2023-04-07 Baolin Peng , Chunyuan Li , Pengcheng He , Michel Galley , Jianfeng Gao

This paper presents a comprehensive exploration of leveraging Large Language Models (LLMs), specifically GPT-4, in the field of instructional design. With a focus on scaling evidence-based instructional design expertise, our research aims…

Computation and Language · Computer Science 2023-06-27 Gautam Yadav

Effective pre-training of large language models (LLMs) has been challenging due to the immense resource demands and the complexity of the technical processes involved. This paper presents a detailed technical report on YuLan-Mini, a highly…

Computation and Language · Computer Science 2024-12-25 Yiwen Hu , Huatong Song , Jia Deng , Jiapeng Wang , Jie Chen , Kun Zhou , Yutao Zhu , Jinhao Jiang , Zican Dong , Wayne Xin Zhao , Ji-Rong Wen

Generative Pre-trained Transformer 4 (GPT-4) is the fourth-generation language model in the GPT series, developed by OpenAI, which promises significant advancements in the field of natural language processing (NLP). In this research…

Computation and Language · Computer Science 2023-05-08 Jawid Ahmad Baktash , Mursal Dawodi

Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting…

This paper studies recent developments in large language models' (LLM) abilities to pass assessments in introductory and intermediate Python programming courses at the postsecondary level. The emergence of ChatGPT resulted in heated debates…

Computers and Society · Computer Science 2023-10-05 Jaromir Savelka , Arav Agarwal , Marshall An , Chris Bogart , Majd Sakr

With the rise of online education platforms, there is a growing abundance of educational content across various domains. It can be difficult to navigate the numerous available resources to find the most suitable training, especially in…

Computation and Language · Computer Science 2024-12-03 Batuhan Sariturk , Rabia Bayraktar , Merve Elmas Erdem