Researchers from the Australian National University (ANU)
and the Commonwealth Scientific and Industrial Research Organisation’s (CSIRO)
Data61 have found
a large hidden job market for PhD graduates by using machine learning to
scan tens of thousands of job ads.
The project set out to produce data and methods to help universities prepare graduates for
workplaces outside academia, and also help industry to recognise the value of
the research skills developed by graduates of PhD and MPhil programs.
The project successfully used machine learning to analyse
job ads in order to better understand Australian industry demand for highly
skilled researchers. Though further research and development work is required, the
machine developed in this project can be used to perform a longitudinal
examination of Australian industry response to the innovation agenda.
The project was funded by the Australian Government
Department of Industry, Innovation and Science, and the job-search company, SEEK
Limited, provided raw data.
This interdisciplinary project was based on a strategic
partnership across governmental, industrial, and academic sectors. The research
team had expertise in computer science, research education, linguistics and
The PhD was originally designed to train academics, but it
is now more common for PhD graduates to leave the higher education sector than
to go on to an academic career. The project report highlights this as a
positive development. Highly skilled researchers working in a wider variety of
industry sectors can drive future economic prosperity, as PhD students can be
the vehicles for enhanced collaboration and knowledge transfer between academia
and industry.
But Australia stands out among developed countries
for the lack of interest shown by non-academic employers towards PhD
graduates, the report notes. In this area, Australia is significantly behind
similar countries on the World Economic Forum competitiveness index.
The report says that PhD programs still tend to privilege
skills required for an academic career over those required by industry. From
the supply side, Australian universities must do more to prepare PhD graduates
for a wider range of workplaces. This will involve changes to the form and
content of the PhD curriculum, in addition to more co-curricular opportunities. From
the demand side, there is a lack of awareness of what skills and capabilities
people develop during the PhD and perhaps a lack of trust in the qualification
as producing ‘work ready’ employees.
Actionable data insights
Government can help by setting policy and creating
incentives, but the data to inform this work was incomplete and not actionable.
Data from the Australian Bureau of Statistics (ABS) and the Graduate
Destination Survey (GDS) show the uptake of graduates in the business
community, but both are lagging indicators. Future-focused information was
required to target policy and education efforts appropriately.
In this project, the team addressed this need for complete,
actionable data with a ‘big data’ approach, by applying Machine Learning (ML)
and Natural Language Processing (NLP), sometimes referred to as ‘Text (Data)
Mining’, to analyse 29,693 authentic job ads.
The report describes a job ad as a pitch to a potential
applicant which outlines the type of work that is required and the type of
person that the employer would like to hire. If employers do not use “PhD” as a
keyword in a job ad, a simple keyword search will not retrieve all ads aimed at
PhD graduates.
This project sought to design an ML-based NLP algorithm that
could learn what a ‘PhD-shaped job’ looks like, highlight such jobs within a large,
complex dataset supplied by SEEK, and enable interactive search and
visualisation of this information as a web demonstration system. In addition, the
team aimed to teach the machine to analyse the job ad and find out what skills
and capabilities were most important to employers.
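To illustrate why a learned classifier can surface ads that a keyword search misses, here is a minimal sketch in Python. It is not the project's algorithm, which is not described in detail here. It swaps in a simple bag-of-words naive Bayes model, and the training ads, the labels "high" and "low", and the names `tokenize`, `keyword_match` and `NaiveBayes` are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_match(ad, keyword="phd"):
    """Simple keyword search: True only if the keyword appears as a word in the ad."""
    return keyword in tokenize(ad)


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # number of ads per label
        self.word_counts = {}               # label -> Counter of word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training ads (not taken from the SEEK dataset).
TRAIN = [
    ("design experiments analyse complex data and publish research findings", "high"),
    ("lead independent research develop statistical models write technical reports", "high"),
    ("serve customers operate the till and stock shelves", "low"),
    ("answer phones file documents and greet visitors", "low"),
]

model = NaiveBayes().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])

new_ad = "Senior analyst to design experiments and develop statistical models"
print(keyword_match(new_ad))   # False: the ad never mentions "PhD"
print(model.predict(new_ad))   # high: the wording resembles research-intensive ads
```

The ad in this toy example never uses the word "PhD", so the keyword search skips it, while the classifier flags it because its vocabulary overlaps with the research-intensive examples. A real system would need far more training data and careful evaluation.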
The results and applications
The machine found a large ‘hidden job market’ for PhD
graduates in the Australian workforce.
Only 20.7 per cent of non-academic job ads (2,770 of 13,379
unique job titles) in the dataset asked for a PhD qualification, yet as many as
43 per cent (210 of 483) of the unique job ads hand-coded by two content
experts during the project required a
high level of research skills and capabilities, indicative of a PhD. Building on this, the machine read 29,693 job
advertisements, predicting 15,440 ads (52%), 10,689 ads
(36%), and 3,564 ads (12%) as having a High, Medium, and
Low Knowledge Intensity Bandwidth, respectively.
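The percentages in this section can be recomputed from the counts quoted above. The short Python check below uses only those counts; the helper name `share` is an assumption made for illustration.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts quoted in the text.
TOTAL_ADS = 29_693
bands = {"High": 15_440, "Medium": 10_689, "Low": 3_564}

# The three bands should account for every ad the machine read.
assert sum(bands.values()) == TOTAL_ADS

print(share(2_770, 13_379))   # 20.7 (ads asking for a PhD)
print(share(210, 483))        # 43.5 (hand-coded ads needing high-level research skills)
for name, count in bands.items():
    print(f"{name}: {share(count, TOTAL_ADS)}%")
# High: 52.0%
# Medium: 36.0%
# Low: 12.0%
```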
While this dataset does not encompass the full extent of the
Australian employment landscape (its ads appeared to skew towards
managerial and professional jobs), it does reflect a dramatic gulf in Australian
perceptions of the PhD.
The team is proposing that the machine could be refined and
used to track changes in industry demand for Australia’s higher degree research
qualified workforce over a five-year period in order to see if this approach is
useful as a benchmarking process.
Supplied with up-to-the-minute job data sources, the machine
could provide real-time details on demand for higher degree knowledge workers,
usable in policy development. Policy makers could use the machine to perform
fine-grained analyses of demand for skilled researchers in a given industry
sector; target further analysis of the specific researcher skill sets required
by various industries; and measure the alignment between employer demand and
targeted funding initiatives in the area of research training, enabling more
forensic targeting of initiatives such as scholarships, internships, and
industry incentive packages.
The machine could enable targeted studies to inform policy
interventions to drive innovation, particularly in regional areas and
industries with currently low levels of demand for highly skilled workers.
Moreover, the methodology could be adapted to make machines
that will enable the analysis of prospects for other cohorts in higher
education, such as undergraduate and master's graduates.
Using the machine with the current dataset, research
education experts and statisticians could produce data to inform evidence-based
curriculum audits. Such audits would explore how teaching and learning activities
align with current industry demand, and identify emerging trends in demand for
research skills to inform course design.
One of the lead researchers, Dr. Will Grant from ANU, said, “We
taught the machine to analyse job ads and tell us what skills were most
important to employers. The problem is that industry employers in Australia –
particularly in manufacturing, transport, logistics, marketing and
communication – may not be aware that PhD graduates have the skill set they're
Co-researcher Adjunct Professor Hanna Suominen, a natural
language processing expert from CSIRO's Data61, co-invented the machine
learning algorithm and is optimistic about the potential of this tool to help
find work for PhD students.
“Our researchers will continue to develop the machine into a
web portal to support PhD graduates in their search for work. It has the
potential to connect PhD graduates with ideal jobs they may not have otherwise
come across or considered,” Dr. Suominen said.