Projects
Problem-solving methods using multiple AI models
- Project Leader
International Center for Elementary Particle Physics, The University of Tokyo
- Researcher
Masahiko Saito (齊藤 真彦), Assistant Professor
Masahiro Morinaga (森永 真央), Project Assistant Professor
Sanmay Ganguly, Project Assistant Professor
Making AI more “intelligent” beyond a useful “tool”

Although AI is increasingly used in particle physics experiments, it is typically applied to one problem at a time and is regarded as a useful “tool” rather than a form of “intelligence.” Our ultimate research theme is to move beyond this situation and investigate how AI can behave more like intelligence. Starting with problems from particle physics experiments, we will take on the challenge of bringing multi-stage problems, which many experimental particle physicists including ourselves have worked on collaboratively, into a single machine learning framework, with the expectation of better results than before. Through this research, we will explore the potential of AI that resembles “intelligence” rather than a “tool.”
Details of Project
Integrating the roles of adjustment and supervision into AI to solve problems
Problems that seem simple at first glance are often complex and multi-staged. Because a real problem consists of subproblems, the final solution may be reached only after all of them have been solved. In human society, such problems are handled efficiently, and high-quality solutions are obtained, by assigning people who understand the whole picture to adjust and supervise the work on the subproblems. Incorporating such a mechanism into AI is expected to improve its efficiency and to achieve results that conventional AI could not.

[1] Developing “Multi-AI” where AI learns from AIs
As an AI framework that coordinates individual subproblems while controlling the whole process, we have been researching and developing “Multi-AI,” in which a control AI learns from a group of AIs (i.e., a set of machine learning models) configured for each function and then selects an appropriate AI for each subproblem. So far, we have developed a framework for connecting, optimizing, and selecting multi-step machine learning models. In the future, we will conduct advanced research on parameter optimization to handle the complex loss functions that arise in multi-step processing. We will also investigate whether human-interpretable information can be extracted from the intermediate data passed between machine learning models, and whether problems can be solved by reusing trained machine learning models (e.g., multi-stage transfer learning).
[2] Demonstrating the feasibility of applying Multi-AI
Concurrently, we will apply this Multi-AI framework to data from particle physics experiments conducted at the International Center for Elementary Particle Physics, The University of Tokyo. These experiments handle several hundred petabytes of data (1 petabyte is 10^15 bytes, i.e., 1000 trillion bytes). Discovering new subatomic particles and other phenomena in such data requires solving multi-stage problems, as the discovery of the Higgs boson in 2012 showed. If it works effectively, Multi-AI will not only advance research in particle physics but also demonstrate its own applicability. Furthermore, we aim to demonstrate the effectiveness of the framework developed in this project by applying it to problems in society at large, beyond particle physics.
Values / Hopes
Understand AI
Consider two types of AI: one that solves a large problem with a single end-to-end machine learning model, and one that divides the problem and solves the resulting subproblems. We think the latter is better in terms of human interpretability. Even with the (slight) performance loss that comes with subdividing and multi-staging, if interpretability can be ensured, such an AI could be used for problems whose solutions must be explained, both in natural science, such as particle physics experiments, and in the real world.
Research outcome
As the first step toward “Multi-AI,” we developed a framework that can connect and select multiple machine learning models. The underlying technology is Neural Architecture Search (NAS), which optimizes the network architecture of machine learning models. The framework developed in this research, called “MultiML,” implements three connection and selection methods: DARTS, SPOS-NAS, and ASNG-NAS.
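The idea behind DARTS-style selection can be illustrated with a minimal sketch. It is not the MultiML implementation: the candidate "models" here are fixed toy functions, the data are invented, and all names (`search_alphas`, `CANDIDATES`, `DATA`) are hypothetical. Real DARTS mixes neural-network operations and alternates weight and architecture updates on separate data splits. This sketch keeps only the core relaxation: candidate outputs are mixed with softmax weights, the architecture parameters are updated by gradient descent on a squared loss, and the candidate with the largest parameter is selected at the end.

```python
import math

def softmax(alphas):
    """Numerically stable softmax over a list of floats."""
    peak = max(alphas)
    exps = [math.exp(a - peak) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_output(candidates, alphas, x):
    """Continuous relaxation: softmax-weighted sum of all candidate outputs."""
    weights = softmax(alphas)
    return sum(w * f(x) for w, f in zip(weights, candidates))

def search_alphas(candidates, data, steps=50, lr=0.1):
    """Gradient descent on the architecture parameters for a mean squared loss."""
    alphas = [0.0] * len(candidates)
    for _ in range(steps):
        weights = softmax(alphas)
        grads = [0.0] * len(alphas)
        for x, y in data:
            outs = [f(x) for f in candidates]
            mix = sum(w * o for w, o in zip(weights, outs))
            # d(mix)/d(alpha_k) = w_k * (o_k - mix) for a softmax mixture
            for k, (w, o) in enumerate(zip(weights, outs)):
                grads[k] += 2.0 * (mix - y) * w * (o - mix) / len(data)
        alphas = [a - lr * g for a, g in zip(alphas, grads)]
    return alphas

# Hypothetical candidates; the toy target is y = 2x, so the first one is correct.
CANDIDATES = [lambda x: 2.0 * x, lambda x: 0.0, lambda x: -2.0 * x]
DATA = [(1.0, 2.0), (2.0, 4.0)]

alphas = search_alphas(CANDIDATES, DATA)
# Discretization step: keep only the candidate with the largest parameter.
chosen = max(range(len(alphas)), key=lambda k: alphas[k])
```

In this toy setting the first candidate reproduces the target exactly, so its architecture parameter grows at every step and it is the one kept after discretization.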
We developed this framework using simulated data from experimental particle physics, with the identification of Higgs and Z bosons decaying into tau pairs as the benchmark problem. In this problem, two tasks are defined and connected. The first task learns to calculate the momentum of the tau particle, and the second learns to identify whether the tau pair was produced by a Higgs boson or a Z boson. The second task takes the result of the first as input and performs the discrimination. We first prepare several machine learning models for each task. The MultiML framework then selects the optimal set from the possible combinations while training the individual models. The framework can be used simply for hyperparameter tuning, but it can also handle completely different machine learning models as candidates.
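The selection over combinations of connected models can be sketched as a plain grid search. This is a simplified stand-in under stated assumptions, not the MultiML API: the candidates are already-fixed toy functions rather than models trained during the search, the data are invented, and the names `TASK1`, `TASK2`, and `select_best` are hypothetical. Each first-task candidate produces an intermediate value, each second-task candidate classifies from it, and every pairing is scored end to end.

```python
from itertools import product

# Toy events: (raw feature, true label). Invented for illustration only.
DATA = [(0.5, 0), (1.0, 0), (1.5, 0), (2.5, 1), (3.0, 1), (3.5, 1)]

# First-task candidates: estimate an intermediate quantity from the raw feature.
TASK1 = {
    "identity": lambda x: x,
    "scaled": lambda x: 3.0 * x,
    "constant": lambda x: 1.0,
}

# Second-task candidates: classify using only the first task's output.
TASK2 = {
    "cut_at_2": lambda m: 1 if m > 2.0 else 0,
    "cut_at_4": lambda m: 1 if m > 4.0 else 0,
}

def error_rate(first, second, data):
    """Fraction of events misclassified by the connected pair of models."""
    wrong = sum(1 for x, label in data if second(first(x)) != label)
    return wrong / len(data)

def select_best(task1, task2, data):
    """Score every (first-task, second-task) pairing and return the best one."""
    scores = {
        (name1, name2): error_rate(f1, f2, data)
        for (name1, f1), (name2, f2) in product(task1.items(), task2.items())
    }
    best = min(scores, key=scores.get)
    return best, scores

best, scores = select_best(TASK1, TASK2, DATA)
```

Exhaustive scoring is only feasible for a handful of candidates. NAS methods such as those named above exist to avoid enumerating every combination as the number of tasks and candidates grows.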
One problem is that information is lost when data is transferred at the connection between tasks. To address this, the output of the preceding task (the momentum, in our benchmark) is not passed directly to the subsequent task; latent-space parameters are passed instead. The preceding task is trained through an additional shallow network that takes the latent parameters as input. This minimized the information lost to subsequent tasks and restored overall performance. With this framework, machine learning models developed for specific problems can be easily incorporated and optimized as part of a larger problem.
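A toy example can show why passing latent parameters instead of the final output helps. Everything in it is an assumption made for illustration: the two-component latent vector, the summing "shallow head," the hand-written classifiers, and the four events are invented and do not come from the benchmark. The point is only that a scalar summary can discard information the next task needs, while the latent vector keeps it.

```python
# Toy events: (raw features, true label). Invented for illustration only.
EVENTS = [((3.0, 1.0), 1), ((1.0, 3.0), 0), ((4.0, 2.0), 1), ((2.0, 4.0), 0)]

def encoder(features):
    """Stand-in for the body of the first-task network: returns a latent vector."""
    return (features[0], features[1])

def momentum_head(latent):
    """Stand-in for the shallow head that maps the latent vector to one scalar."""
    return latent[0] + latent[1]

def classify_from_momentum(momentum):
    """Second task fed only the scalar output of the first task."""
    return 1 if momentum > 5.0 else 0

def classify_from_latent(latent):
    """Second task fed the latent vector, which still separates the components."""
    return 1 if latent[0] > latent[1] else 0

def accuracy(predict, events):
    """Fraction of events for which predict(features) matches the label."""
    correct = sum(1 for features, label in events if predict(features) == label)
    return correct / len(events)

acc_scalar = accuracy(
    lambda f: classify_from_momentum(momentum_head(encoder(f))), EVENTS
)
acc_latent = accuracy(lambda f: classify_from_latent(encoder(f)), EVENTS)
```

In these toy events, each pair with opposite labels has the same summed value, so no classifier that sees only the scalar can separate them, whereas the latent vector preserves the distinguishing component.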
In parallel, we conducted applied research on the latest machine learning techniques for data analysis in particle physics experiments: event identification with a small amount of training data using transfer learning, event identification combining Feynman diagrams and graph networks, new-particle searches based on anomaly detection with normalizing flows, identification of Higgs bosons and large-radius jets using a Vision Transformer, top-quark identification using a graph network, and modeling of jet hadronization using a diffusion model. Each of these is a machine learning model specialized for a specific problem, and in the future they can serve as components of particle physics data analysis built with the MultiML framework.
