FedBFPT: An Efficient Federated Learning Framework for Bert Further Pre-training

Xin'ao Wang, Huan Li, Ke Chen, Lidan Shou

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4344-4352. https://doi.org/10.24963/ijcai.2023/483

This study proposes FedBFPT (Federated BERT Further Pre-Training), a Federated Learning (FL) framework for further pre-training the BERT language model in specialized domains while addressing privacy concerns. FedBFPT enables multiple clients to collaboratively train the shallower layers of BERT, which are crucial in the pre-training stage, without sharing private data. To achieve this, FedBFPT builds a local model for each client, progressively trains the shallower layers of the local models while sampling deeper layers, and aggregates the trained parameters on a server to form the final global model. In this way, multiple smaller local models are used to further pre-train the global model, which is then adapted to specific downstream tasks via fine-tuning, reducing resource usage while maintaining accuracy. Theoretical analysis is conducted to support the efficiency of FedBFPT, and experiments are carried out on corpora from domains such as medicine, biology, and computer science. Results indicate that FedBFPT achieves performance comparable to traditional FL methods while reducing computation and communication costs by 46.70% and 7.04%, respectively, even approaching the performance of centrally trained models. The source code is released at https://github.com/Hanzhouu/FedBFPT.
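The abstract describes the mechanism only at a high level. The minimal PyTorch sketch below illustrates one way such a scheme could look: each client trains copies of the first few (shallow) Transformer layers while a randomly sampled deeper layer is kept frozen in the forward pass, and the server averages only the trained shallow-layer parameters. This is not the authors' released implementation (see the GitHub link above); the layer counts, the random sampling rule, the placeholder loss, and the plain parameter averaging are all illustrative assumptions.

```python
# Illustrative sketch of progressive shallow-layer training with deeper-layer
# sampling and server-side averaging. All concrete choices here are assumptions.
import copy
import random
import torch
import torch.nn as nn

NUM_LAYERS, HIDDEN = 6, 64  # toy sizes; BERT-base uses 12 layers, hidden size 768

def client_update(global_layers, data, depth, lr=1e-3):
    """Train the first `depth` layers locally; one deeper layer is sampled and
    used frozen in the forward pass (hypothetical sampling rule)."""
    sampled_idx = random.randrange(depth, NUM_LAYERS)
    trainable = nn.ModuleList(copy.deepcopy(global_layers[i]) for i in range(depth))
    frozen = copy.deepcopy(global_layers[sampled_idx])
    for p in frozen.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(trainable.parameters(), lr=lr)
    for x in data:  # x: (batch, seq, HIDDEN)
        h = x
        for layer in trainable:
            h = layer(h)
        h = frozen(h)
        loss = h.pow(2).mean()  # placeholder objective (the paper uses MLM pre-training)
        opt.zero_grad(); loss.backward(); opt.step()
    # only the shallow-layer parameters are sent back for aggregation
    return {i: copy.deepcopy(trainable[i].state_dict()) for i in range(depth)}

def server_aggregate(global_layers, client_updates):
    """Average each trained shallow layer across clients and write it back."""
    depth = len(client_updates[0])
    for i in range(depth):
        avg = {k: torch.stack([u[i][k] for u in client_updates]).mean(dim=0)
               for k in client_updates[0][i]}
        global_layers[i].load_state_dict(avg)

if __name__ == "__main__":
    base = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
    global_layers = nn.ModuleList(copy.deepcopy(base) for _ in range(NUM_LAYERS))
    fake_data = [torch.randn(2, 8, HIDDEN) for _ in range(3)]
    for rnd, depth in enumerate([1, 2, 3], start=1):  # progressively cover more shallow layers
        updates = [client_update(global_layers, fake_data, depth) for _ in range(4)]
        server_aggregate(global_layers, updates)
        print(f"round {rnd}: aggregated the first {depth} layer(s)")
```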
Keywords:
Machine Learning: ML: Federated learning
Natural Language Processing: NLP: Applications