Understanding how job requirements evolve is increasingly important for workers, companies and public organizations that need to keep pace with the rapid transformation of the employment market. Recent natural language processing (NLP) approaches make it possible to automatically extract information from job ads and to recognize skills more precisely. However, these approaches need a large amount of annotated data from the studied domain, which is difficult to access, mainly because of intellectual property restrictions. This article proposes a new public dataset, FIJO, containing insurance job offers with many soft skill annotations. To show the potential of this dataset, we detail some of its characteristics and limitations. We then present the results of skill detection algorithms based on a named entity recognition approach and show that transformer-based models achieve good token-wise performance on this dataset. Lastly, we analyze some errors made by our best model to highlight the difficulties that may arise when applying NLP approaches.

Read the conference proceedings

Keywords: Soft Skill Detection, NLP, French Supervised Corpus, Machine Learning
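
The abstract reports token-wise performance for skill detection framed as named entity recognition. As a rough illustration of what a token-wise score measures, the sketch below computes per-label precision, recall and F1 by comparing gold and predicted tags one token at a time. It is not the authors' evaluation code: the function name `token_wise_scores`, the `SKILL` label and the toy tag sequences are assumptions made for this example, not FIJO's actual annotation scheme.

```python
from collections import Counter


def token_wise_scores(gold, pred, outside="O"):
    """Return {label: (precision, recall, f1)} computed token by token.

    `gold` and `pred` are equal-length lists of tags, one per token.
    Tokens tagged `outside` belong to no skill and are not scored as a label.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != outside:
                tp[g] += 1      # correct skill token
        else:
            if p != outside:
                fp[p] += 1      # predicted a label that is not the gold one
            if g != outside:
                fn[g] += 1      # missed the gold label

    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical five-token sentence; the tags are illustrative only.
gold_tags = ["O", "SKILL", "SKILL", "O", "SKILL"]
pred_tags = ["O", "SKILL", "O", "O", "SKILL"]
example_scores = token_wise_scores(gold_tags, pred_tags)
```

In this toy case the model finds two of the three gold skill tokens and predicts no spurious ones, so precision is perfect while recall is lower. A token-wise score of this kind gives credit for partially recovered skill spans, which an exact span-match metric would count as errors.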


The development and use of artificial intelligence is a central issue in many countries. Steve Jacob, professor, Department of Political Science and Research Chair in Public Administration in the Digital Age at Université Laval, and Justin Lawarée, public affairs advisor at the International Observatory on the Societal Impacts of AI and Digital Technology (OBVIA), analyzed the content of government strategies on AI from an international perspective. Nearly 500 public measures were identified in the government strategies of Canada, the United States, France, the United Kingdom and the European Union.

The categories of public measures most represented in the 28 strategies analyzed are those concerning the development of organizational capabilities and those mobilizing economic means to develop AI. The weight of economic measures reflects the prominence of government strategies focused on industrial applications and the need for talent in this field of activity.

The ethical dimension of AI development is present in some of the documents analyzed. However, despite the importance of the societal and ethical impacts of AI and digital technologies, ethical issues are not systematically highlighted in the government strategies studied.


This dataset was collected as part of Université Laval's multidisciplinary project Femmes face aux défis de la transformation numérique : une étude de cas dans le secteur des assurances (Women facing the challenges of digital transformation: a case study in the insurance sector), funded by the Future Skills Centre (Centre des compétences futures). It brings together French-language job offers from Canadian insurance companies between 2009 and 2020.

View the dataset

Keywords: Job offers, Skills