Sarah Alnegheimish's research interests reside at the intersection of machine learning and systems engineering. Her objective: to make machine learning systems more accessible, transparent, and trustworthy.
Alnegheimish is a PhD student in Principal Research Scientist Kalyan Veeramachaneni's group at the MIT Laboratory for Information and Decision Systems (LIDS). Here, she commits most of her energy to developing Orion, a machine learning framework and time series library that can detect anomalies without supervision in large-scale industrial and operational settings.
Early influence
The daughter of a university professor and a teacher, she learned from an early age that knowledge should be freely shared. “I think growing up in a home where education was highly valued is part of why I want to make machine learning tools accessible.” Alnegheimish's personal experience with open-source resources only increased her motivation. “I learned to view accessibility as the key to adoption. To have an impact, new technology must be accessed and assessed by those who need it. That is the whole purpose of doing open-source development.”
Alnegheimish earned her bachelor's degree at King Saud University (KSU). “I was in the first cohort of computer science majors. Before this program was created, the only other available major in computing was IT (information technology).” Being part of the first cohort was exciting, but it brought its own unique challenges. “All of the faculty were teaching new material. Succeeding required an independent learning experience. That is when I first came across MIT OpenCourseWare: as a resource to teach myself.”
Shortly after graduating, Alnegheimish became a researcher at King Abdulaziz City for Science and Technology (KACST), Saudi Arabia's national laboratory. Through the Center for Complex Engineering Systems (CCES) at KACST and MIT, she began conducting research with Veeramachaneni. When she applied to graduate school, his research group was her first choice.
Creating Orion
Alnegheimish's master's thesis focused on time series anomaly detection: identifying unexpected behaviors or patterns in data, which can provide users with crucial information. For example, unusual patterns in network traffic data can be a sign of cybersecurity threats, abnormal sensor readings in heavy machinery can predict potential future failures, and monitoring patients' vital signs can help reduce health complications. It was through her master's research that Alnegheimish first began designing Orion.
Orion uses statistical and machine learning-based models that are continuously maintained. Users do not need to be machine learning experts to use the code. They can analyze signals, compare anomaly detection methods, and investigate anomalies in an end-to-end program. The framework, code, and datasets are all open source.
“With open source, accessibility and transparency are directly achieved. You have unrestricted access to the code, where you can investigate how the model works by understanding the code. We have increased transparency with Orion: we label every step in the model and present it to the user.” Alnegheimish says that this transparency helps users begin trusting the model before they ultimately see for themselves how reliable it is.
“We are trying to take all these machine learning algorithms and put them in one place so that anyone can use our models off the shelf,” she says. “It is not only for the sponsors we work with at MIT. It is being used by many public users. They come to the library, install it, and run it on their data.”
Repurposing models for anomaly detection
In her PhD, Alnegheimish is exploring further innovative ways to detect anomalies using Orion. “When I first started my research, all machine learning models needed to be trained from scratch on your data. Now we are at a time when we can use pre-trained models,” she says. Working with pre-trained models saves time and computational costs. The challenge, however, is that time series anomaly detection is a brand-new task for them. “In their original sense, these models have been trained to forecast, but not to find anomalies,” Alnegheimish says. “We are pushing their boundaries through prompt engineering, without any additional training.”
Because these models already capture the patterns of time series data, Alnegheimish believes they already have everything they need to detect anomalies. So far, her current results support this theory. They do not surpass the success rate of models that are independently trained on specific data, but she believes they will one day.
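One common way to turn a forecaster into an anomaly detector is to flag the points where the forecast misses badly. The sketch below illustrates that general idea only; it is not Orion's method or the prompt-engineering approach described above. The function names are hypothetical, and a simple moving average stands in for a pre-trained forecasting model.

```python
from statistics import mean, stdev


def naive_forecast(history, window=3):
    """Stand-in for a pre-trained forecaster (an assumption for illustration):
    predicts the next value as the mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def detect_by_forecast_error(values, window=3, k=2.0):
    """Return indices whose absolute forecast error exceeds
    mean(errors) + k * stdev(errors)."""
    errors = []
    for i in range(window, len(values)):
        prediction = naive_forecast(values[:i], window)
        errors.append(abs(values[i] - prediction))
    threshold = mean(errors) + k * stdev(errors)
    # errors[j] belongs to values[j + window]
    return [j + window for j, err in enumerate(errors) if err > threshold]


signal = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1, 9.9, 10.0, 10.2, 10.1]
anomalies = detect_by_forecast_error(signal, k=2.0)
print(anomalies)  # [6]
```

No detector is trained here: the forecaster is used as is, and the only added logic is a threshold on its errors. A stronger forecaster would simply replace `naive_forecast`.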
Accessible design
Alnegheimish speaks at length about the efforts she has gone through to make Orion more accessible. “Before I came to MIT, I used to think that the crucial part of research was to develop the machine learning model itself or improve on its current state. With time, I realized that the only way to make your research accessible and adaptable for others is to develop systems that make it accessible. During my graduate studies, I have taken the approach of developing my models and systems in tandem.”
The key element in her system development was finding the right abstractions to work with her models. These abstractions provide a universal representation for all models, with simplified components. “Any model will have a sequence of steps to go from raw input to desired output. We have standardized the input and output, which allows the middle to be flexible and fluid. So far, all the models we have run have been able to retrofit into our abstractions.” The abstractions she uses have been stable and reliable for the last six years.
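The idea of a fixed input and output with a flexible middle can be sketched in a few lines. The example below is an illustration under stated assumptions, not Orion's actual abstraction: the `Pipeline` class, the step functions, and the choice of (timestamp, value) pairs as input and (start, end) intervals as output are all hypothetical.

```python
from typing import Any, Callable, List, Tuple

Signal = List[Tuple[int, float]]    # assumed standard input: (timestamp, value)
Intervals = List[Tuple[int, int]]   # assumed standard output: (start, end)
Step = Tuple[str, Callable[[Any], Any]]


class Pipeline:
    """Runs labeled steps in order; only the first input and the
    final output have a fixed shape."""

    def __init__(self, steps: List[Step]):
        self.steps = steps

    def describe(self) -> List[str]:
        # Expose the step labels so a user can see what the model does.
        return [name for name, _ in self.steps]

    def run(self, signal: Signal) -> Intervals:
        data: Any = signal
        for _, step in self.steps:
            data = step(data)
        return data


def center(signal):
    avg = sum(v for _, v in signal) / len(signal)
    return [(t, v - avg) for t, v in signal]


def flag_large(signal, limit=5.0):
    return [(t, abs(v) > limit) for t, v in signal]


def to_intervals(flags):
    # Merge consecutive flagged timestamps into (start, end) intervals.
    intervals, start, prev = [], None, None
    for t, is_anomalous in flags:
        if is_anomalous:
            start = t if start is None else start
            prev = t
        elif start is not None:
            intervals.append((start, prev))
            start = None
    if start is not None:
        intervals.append((start, prev))
    return intervals


pipeline = Pipeline([("center", center),
                     ("flag_large", flag_large),
                     ("to_intervals", to_intervals)])
signal = [(0, 1.0), (1, 1.2), (2, 0.9), (3, 12.0), (4, 13.0), (5, 1.1), (6, 1.0)]
intervals = pipeline.run(signal)
print(pipeline.describe(), intervals)
```

Swapping `flag_large` for a different middle step leaves the callers untouched, which is the property the standardized input and output are meant to provide.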
The value of building systems and models simultaneously can be seen in Alnegheimish's work as a mentor. She had the opportunity to work with two master's students earning their engineering degrees. “All I showed them was the system itself and the documentation on how to use it. Both students were able to develop their own models with the abstractions we are conforming to. It reaffirmed that we are taking the right path.”
Alnegheimish also investigated whether a large language model (LLM) could be used as a mediator between users and a system. The LLM agent she implemented is able to connect to Orion without users needing to know the small details of how Orion works. “Think of ChatGPT. You have no idea what the model behind it is, but it is very accessible to everyone.” For her software, users only need to know two commands: Fit and Detect. Fit allows users to train their model, while Detect allows them to detect anomalies.
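A two-command interface of this kind might look like the following minimal sketch. It is a stand-in written for illustration, not Orion's code: the `ThresholdDetector` class and its parameter `k` are hypothetical, and the "model" is only a mean and standard deviation learned from training data.

```python
from statistics import mean, stdev


class ThresholdDetector:
    """Hypothetical detector exposing only fit() and detect()."""

    def __init__(self, k=3.0):
        self.k = k
        self.center = None
        self.spread = None

    def fit(self, values):
        # "Training" here just records the typical level and variation.
        self.center = mean(values)
        self.spread = stdev(values)
        return self

    def detect(self, values):
        if self.center is None:
            raise RuntimeError("call fit() before detect()")
        limit = self.k * self.spread
        return [i for i, v in enumerate(values) if abs(v - self.center) > limit]


detector = ThresholdDetector(k=3.0)
detector.fit([20.1, 19.8, 20.0, 20.3, 19.9, 20.2, 20.0, 19.7])
found = detector.detect([20.0, 20.1, 27.5, 19.9])
print(found)  # [2]
```

Everything else (thresholds, model internals) stays behind those two calls, which is what lets an intermediary such as an LLM agent drive the system on a user's behalf.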
“The ultimate goal of what I have tried to do is to make AI more accessible to everyone,” she says. So far, Orion has reached more than 120,000 downloads, and more than a thousand users have marked the repository as one of their favorites on GitHub. “Traditionally, you used to measure the impact of research through citations and paper publications. Now you get real-time adoption through open source.”
