In healthcare, differences observed between demographic groups can generally be categorised as biological or non-biological. Non-biological differences, such as visit frequency and reporting style, are harder to track and can unexpectedly introduce predictive bias into machine learning algorithms. This is particularly true for complex free-text data in the mental health domain. In this talk, we will present our framework for analysing text-related bias in Natural Language Processing (NLP) models, developed for a paediatric anxiety use case with a focus on sex demographic subgroups. The framework first measures model bias and then traces its origins to statistical word distributions and the generalisation capacities of NLP algorithms. Motivated by these findings, we propose a data-centric bias mitigation strategy based on sentence informativeness filtering and the masking of gender-related words. Our approach achieved a bias reduction of up to 27%, improving classification parity between sex demographic groups while maintaining overall performance.
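The abstract does not give implementation details, so the following is only a rough sketch of what the gender-word masking step of such a data-centric mitigation might look like in Python. The term list, mask token, and function name are illustrative assumptions, not the authors' actual method, which would use a clinically validated lexicon and tokenisation matched to the downstream NLP model.

    import re

    # Hypothetical lexicon of gender-related words to mask; the actual
    # list used in the study is not specified in the abstract.
    GENDER_TERMS = {
        "he", "she", "him", "her", "his", "hers", "himself", "herself",
        "boy", "girl", "man", "woman", "male", "female",
        "mother", "father", "son", "daughter", "brother", "sister",
    }

    MASK_TOKEN = "[MASK]"  # assumed placeholder; a model-specific token could be used instead

    def mask_gender_terms(text: str) -> str:
        """Replace gender-related words with a neutral mask token.

        A minimal sketch of the masking idea described in the abstract:
        word-level substitution that removes explicit gender cues while
        leaving the rest of the clinical note intact.
        """
        def replace(match: re.Match) -> str:
            word = match.group(0)
            return MASK_TOKEN if word.lower() in GENDER_TERMS else word

        # Match alphabetic word runs so punctuation and spacing are preserved.
        return re.sub(r"[A-Za-z]+", replace, text)

    if __name__ == "__main__":
        note = "She reports that her brother teases her at school."
        print(mask_gender_terms(note))
        # -> "[MASK] reports that [MASK] [MASK] teases [MASK] at school."

In a full pipeline, this masking would be applied alongside the sentence informativeness filtering mentioned in the abstract before model training, so that classification parity between sex subgroups can be compared against an unmasked baseline.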