A comparative analysis with machine learning of public data governance and AI policies in the European Union, United States, and China


  • Christophe Bisson, SKEMA
  • Adele Giron, SKEMA
  • Gauthier Verin, SKEMA




public data governance, artificial intelligence policy, text mining


This paper explores public data governance and AI policies in the world's three leading technological regions, the United States, China, and the European Union, through a machine learning analysis of the scientific literature. We used RapidMiner's text mining tools to classify texts and identify the recurring themes in each region, combining Term Frequency-Inverse Document Frequency (TF-IDF) weighting with two supervised machine learning techniques, k-nearest neighbors (KNN) and Naïve Bayes. Our results reveal the most influential items for each region, which point to three distinct approaches in China, the United States, and the EU.
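To make the pipeline named in the abstract more concrete, the following is a minimal Python sketch of TF-IDF weighting followed by KNN and Naïve Bayes classification. It is not the authors' RapidMiner workflow: the toy corpus, the region labels, the function names, the smoothing choices, and k = 3 are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (region label, text); NOT data from the paper.
DOCS = [
    ("EU", "privacy regulation protects citizens data rights"),
    ("EU", "data protection regulation and citizens privacy"),
    ("US", "market innovation drives private companies data"),
    ("US", "companies lead innovation in the data market"),
    ("CN", "state strategy directs national data development"),
    ("CN", "national strategy and state control of data"),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(tokens, idf):
    """L2-normalised TF-IDF vector as a dict; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_predict(query_vec, train, k=3):
    """Majority vote among the k training documents most similar to the query."""
    ranked = sorted(train, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]

def nb_fit(labelled_tokens):
    """Multinomial Naive Bayes statistics computed from raw term counts."""
    counts, priors, vocab = {}, Counter(), set()
    for label, tokens in labelled_tokens:
        priors[label] += 1
        counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return counts, priors, vocab

def nb_predict(tokens, model):
    """Pick the label with the highest log posterior (add-one smoothing)."""
    counts, priors, vocab = model
    total_docs = sum(priors.values())

    def log_score(label):
        denom = sum(counts[label].values()) + len(vocab)
        score = math.log(priors[label] / total_docs)
        for t in tokens:
            if t in vocab:
                score += math.log((counts[label][t] + 1) / denom)
        return score

    return max(priors, key=log_score)

def top_terms(label, train, n=3):
    """Terms with the highest summed TF-IDF weight across one region's documents."""
    totals = Counter()
    for lab, vec in train:
        if lab == label:
            totals.update(vec)
    return [term for term, _ in totals.most_common(n)]

labelled = [(label, tokenize(text)) for label, text in DOCS]
idf = fit_idf([tokens for _, tokens in labelled])
train = [(label, tfidf(tokens, idf)) for label, tokens in labelled]
nb_model = nb_fit(labelled)

query = tokenize("privacy rights regulation for citizens")
print("KNN:", knn_predict(tfidf(query, idf), train))
print("Naive Bayes:", nb_predict(query, nb_model))
print("Top EU terms:", top_terms("EU", train))
```

In this sketch the TF-IDF weights play two roles: they provide the vector space in which KNN measures similarity, and their per-region totals give a simple ranking of each region's most characteristic terms. The Naïve Bayes classifier here works on raw term counts, which is one common choice among several.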

