From Job Descriptions to Occupations: Using Neural Language Models to Code Job Data


See more information at https://metrics-and-models.github.io/!

Occupation is a fundamental concept in social and policy research, but classifying job descriptions into occupational categories is challenging and error-prone. Traditionally, this work relied on expert manual coding, translating detailed and often ambiguous job descriptions into standardized categories, a process that is both laborious and costly. Recent advances in computational techniques, however, offer efficient automated alternatives. Existing autocoding tools, including the O*NET-SOC AutoCoder, the NIOCCS AutoCoder, and the SOCcer AutoCoder, rely on supervised machine learning methods and string-matching algorithms; they are not designed to understand the semantic meaning of occupational write-in text. We explore the use of Large Language Models (LLMs) for classifying jobs into standard Census occupations. We evaluate and compare the prediction performance of LLMs under four approaches: zero-shot learning, few-shot learning, chain-of-thought prompting, and fine-tuning. The results show a wide range of autocoding accuracy, from 7.1% to 78%. Drawing on Census expert coding practices, we provide practical recommendations for using LLMs in occupational classification for sociological research, and we demonstrate LLM applications for coding resume data, processing survey occupational write-ins, and converting international occupational classifications to U.S. standards.
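As a concrete illustration of the first of the four approaches compared above, the following Python sketch builds a zero-shot classification prompt for a single job record. The prompt wording and the `build_zero_shot_prompt` helper are illustrative assumptions for exposition, not the authors' actual prompts; in practice the resulting string would be sent to an LLM API.

```python
def build_zero_shot_prompt(job_title: str, job_description: str) -> str:
    """Build a zero-shot prompt asking an LLM for a Census occupation code.

    Zero-shot means the prompt contains only the task instruction and the
    job record itself -- no labeled examples (few-shot) and no request for
    intermediate reasoning (chain-of-thought).
    """
    return (
        "You are an expert occupational coder. Assign the job below to a "
        "standard Census occupation code. Respond with the code only.\n\n"
        f"Job title: {job_title}\n"
        f"Job description: {job_description}\n"
        "Occupation code:"
    )

# Example job record (hypothetical data for illustration).
prompt = build_zero_shot_prompt(
    "Registered Nurse",
    "Provides patient care, administers medication, coordinates with physicians.",
)
print(prompt)
```

The few-shot variant would prepend several (description, code) pairs before the target record, and the chain-of-thought variant would instead instruct the model to explain its reasoning before emitting a code.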