The era of AI empowering everything has arrived. By 2025, the number of AI-enabled devices is expected to grow to 75 million, and AI will touch every industry and job. Each type of AI application may bring its own development ecosystem and architecture; for enterprises, this fragmentation means high maintenance costs and high security risks, which slows the adoption of new technologies and innovation in enterprise AI. The LF AI & Data open source community is committed to building a global ecosystem for AI technology and developers. We sincerely invite all open source AI users, contributors, and community members to join this gathering and explore the future of open source AI together.
The large language model revolution ignited by ChatGPT is having a profound impact. As one of the scarcest resources of the intelligent era, data is unquestionably important, and it often becomes the bottleneck for model development and tuning at major enterprises and research institutions. This session focuses on the data pain points behind large models and on future-oriented solutions.
Junping Du | Chairman of LF AI & DATA Foundation
Application of Large-scale Language Models in Intelligent Document QA: A Solution Based on Langchain and Langchain-Serve
The task of a document question-answering system is to find answers to user questions in document data. As the number of documents grows, traditional search methods can no longer meet users' needs. With the development of deep learning models, document question-answering systems have migrated from character-matching methods to vector-representation methods. However, these can still only return paragraphs relevant to the question and cannot directly provide answers, especially for yes/no questions. Recently, the steadily improving capabilities of large language models have provided a way to generate answers in document question-answering systems. The next generation of document question-answering systems will integrate traditional models, deep learning question-answering models, and large language model technologies to provide users with a more comprehensive document question-answering service. This presentation will introduce how to use the Langchain development framework and the Langchain-Serve deployment tool to build intelligent document question-answering systems.
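To make the retrieval step concrete, here is a toy sketch of vector-representation retrieval in plain Python. The bag-of-words `embed` function is a hypothetical stand-in for a real sentence-embedding model; a production system built with Langchain would pass the retrieved paragraphs to a large language model to generate the final answer.

```python
import math
from collections import Counter

def embed(text):
    # Hypothetical stand-in for a sentence-embedding model:
    # a sparse bag-of-words vector keyed by lowercase tokens.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, paragraphs, top_k=1):
    # Rank document paragraphs by vector similarity to the question.
    q = embed(question)
    ranked = sorted(paragraphs, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]

paragraphs = [
    "The warranty covers hardware defects for two years.",
    "Shipping to Europe takes five to seven business days.",
]
best = retrieve("How long is the warranty period?", paragraphs)
```

Real systems replace the toy `embed` with dense embeddings from a neural encoder and store them in a vector index, but the ranking logic is the same.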
Artificial intelligence has gradually moved from "refining models" to "refining large models". Compared with traditional models trained for specific application scenarios, large models have strong generalization ability and are no longer limited to a single scenario. They therefore require larger and broader data inputs and more computing power for training, at a cost most developers cannot afford. How to lower the threshold for training and applying large models has become a new challenge. In this session we share practical experience with MindSpore's one-stop, easy-to-use large model platform, which integrates model selection, online inference, and online training. It supports online experience and fine-tuning of large models, so developers can work hands-on with applications such as text-to-text generation, text-to-image generation, and large-model-based remote sensing detection.
AI has become an indispensable part of computer infrastructure, and databases optimized for AI scenarios have emerged. AI databases must not only meet the functional requirements of feature engineering and machine learning model deployment, but also satisfy higher offline and online performance requirements. Taking the OpenMLDB project as an example, this talk gives an in-depth introduction to the application scenarios and performance optimization of AI databases, showing how specific AI scenarios can be implemented rapidly with performance improvements of several times or even dozens of times.
Dihao Chen | Platform Architect of 4Paradigm
In the era of AIGC, vector databases play an increasingly important role in processing massive amounts of unstructured data. This talk focuses on how vector databases empower AI in the AIGC wave.
Jerry Li | Head of the OSChina Community and Operation Director of the Linux Foundation's Open Source Software Academy
PyTorch 2.0: the journey of bringing compiler technologies to the core of PyTorch
PyTorch 2.0 uses compilers to deliver faster training and inference without sacrificing the flexibility and ease of use of PyTorch. This talk will provide an overview of the technology stack behind the new torch.compile() API, discussing the key features of PyTorch 2.0, including its full backward compatibility and 43% faster model training. We will introduce the stack's components, such as TorchDynamo, AOTAutograd, PrimTorch, and TorchInductor, and how they work together to streamline the model development process. Attendees will gain a deeper understanding of the PyTorch 2.0 architecture and the benefits of incorporating compiler technologies into deep learning frameworks.
Peng Wu | Engineering Manager
Deep learning platform + large models: solidifying the foundation of industrial intelligence
Combining the latest trends in generative AI with Baidu's practice, this speech introduces the progress of Baidu's deep learning platform + large model research and development, product innovation, and ecosystem building. It also shares thoughts on developing an industrial-grade open-source deep learning platform based on PaddlePaddle and on integrating industry and education to build an ecosystem under these new trends.
Jun Zhang | Product Manager of Baidu's PaddlePaddle framework, member of the OpenAtom TOC
Federated learning enables the collaborative training of a model by multiple data sources without the need to share their data. In recent years, large language models based on transformers have become increasingly popular. However, these models present challenges due to their high computational resource requirements and complex algorithms. In this presentation, we will introduce FATE’s latest efforts in applying federated learning to large language models such as GPT-J, ChatGLM-6B, GLM, and LLaMA in financial use cases. FATE combines the distributed training mechanism of federated learning with large models to keep sensitive data from all parties within their local domains while allowing for computational investment based on each party’s actual data volume. This enables joint training of large models and mutual benefit. We will also discuss technical and practical considerations, real-world use cases, and the need for privacy-preserving mechanisms.
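The aggregation idea underlying this setup, letting each party contribute in proportion to its local data volume while raw data stays in its own domain, can be sketched as a plain federated-averaging step. This is an illustrative sketch only; FATE's actual protocols add secure computation and are far more involved.

```python
def fedavg(client_weights, client_sizes):
    # Weighted average of model parameters, one weight list per client.
    # Each client's contribution is proportional to its local data volume,
    # and only parameters (never raw data) leave the client's domain.
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    merged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            merged[i] += w * (size / total)
    return merged

# Two parties with different data volumes jointly average a 3-parameter model.
merged = fedavg([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], client_sizes=[100, 300])
```

For large language models, the same scheme is typically applied to a small set of trainable adapter parameters rather than all model weights, which keeps communication costs manageable.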
Lin Peng | Senior Research Fellow of VMware CTO Office
Model inference optimization: exploring the potential of AI implementation
The trend toward large models is unstoppable, and improving model inference efficiency has become an urgent problem. This talk introduces the current status and trends of model inference optimization technology and shares Adlik's practice in this field.
Liya Yuan | Senior Standards and Open Source Engineer of ZTE
Xtreme1: the next-generation multimodal open-source training data platform
A UBS Global Research report found that 70-90% of AI engineers' time is spent on training data. Many algorithms already perform very well in practice, and data has become the new bottleneck for developing AI models. In response, the BasicFinder team developed the Xtreme1 training data platform, dedicated to building the most accessible open-source data-centric MLOps infrastructure to connect people, models, and data. Xtreme1 is the world's first open-source tool that supports multimodal data annotation and introduces an ontology that cuts across the problem abstractions of different AI clients. It fully follows cloud-native architecture principles to ensure scalable service performance, flexible deployment, and service resilience in case of failures.
Jiajun Wang | CTO of BasicFinder
OPPO's exploration and practice in mobile graphics technology: the O3DE Mobile WG and ShaderNN
In recent years, with the continuous improvement of mobile computing power and the rapid development of deep learning research, and especially with growing demand for data security and the maturation of small network models, more and more inference that used to run in the cloud has moved to mobile devices. Deep learning inference on mobile platforms involves hardware platforms, drivers, compilation optimization, model compression, operator-level algorithm optimization, and deployment. Efficient inference frameworks suited to system business development have become an urgent need and a development focus in the industry.
Driven by the need for efficient AI inference in graphics and image post-processing on mobile devices, and to reduce business integration costs and improve efficiency, we developed ShaderNN, an efficient inference engine based on GPU shaders. It runs inference directly on GPU textures, saving I/O time without relying on third-party libraries. It is compatible across hardware platforms, supports mainstream deep learning training frameworks, and is convenient to optimize, integrate, deploy, and upgrade.
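Inference maps naturally onto GPU textures because many neural-network operators are per-pixel neighborhood computations, exactly the pattern a fragment shader evaluates in parallel over a texture. The following sequential Python sketch illustrates the access pattern (it is not ShaderNN's actual code, which runs as GPU shaders):

```python
def conv2d_like_shader(texture, kernel):
    # Each output pixel is computed independently from a 3x3 neighborhood
    # of the input texture, which is the per-fragment access pattern a GPU
    # shader evaluates in parallel (simulated sequentially here).
    h, w = len(texture), len(texture[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(-1, 2):
                for kx in range(-1, 2):
                    acc += texture[y + ky][x + kx] * kernel[ky + 1][kx + 1]
            out[y - 1][x - 1] = acc
    return out

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
tex = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = conv2d_like_shader(tex, identity)  # identity kernel keeps the centre pixel
```

Because every output pixel is independent, the GPU can evaluate all of them concurrently, and keeping the data in textures avoids copying between CPU and GPU memory between layers.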
Zhouhu Peng | Head of OPPO's OSPO
Next-generation knowledge tools: user-centered personalized language models and hybrid deployment strategies
Cheng Chang | Head of YINXIANG Research Institute
Intel’s Journey with PyTorch: Democratizing AI with ubiquitous hardware and open software
PyTorch is one of the most popular frameworks for deep learning and machine learning. Intel has been a long-term contributor and evangelist in the PyTorch community. In this talk, we will share our experience contributing to PyTorch, both in the core framework and in its ecosystem libraries. We will detail our optimizations in torch.compile(), the flagship new feature of PyTorch 2.0, and showcase their benefits on CPUs. We will demonstrate the value of open software and ubiquitous hardware by showcasing generative AI applications powered by diffusion and large language models running with PyTorch on Intel CPUs and GPUs. We will also touch on some of the PyTorch ecosystem projects we contribute to, such as Hugging Face, DeepSpeed, and PyG. Finally, we will discuss our future plans and our vision for continuing our partnership with the PyTorch Foundation and advancing the state of the art in deep learning and machine learning.
Mingfei Ma | Senior Deep Learning Software Engineer
DeepRec: High-performance deep learning framework for recommendation scenarios
DeepRec is a high-performance deep learning framework for recommendation scenarios, open-sourced by PAI, Alibaba Cloud's machine learning platform. It deeply optimizes the performance of sparse models in distributed computing, graph optimization, operators, runtime, and other areas, and provides features such as dynamic elastic features, dynamic elastic dimensions, adaptive EmbeddingVariable, and incremental model export and loading. Within Alibaba Group, DeepRec supports core businesses such as Taobao, Tmall, Alimama, AMap, ITao, AliExpress, and Lazada, handling large-scale sparse training with billions of features and trillions of samples. Since its open-source release over a year ago, DeepRec has been widely used by dozens of companies in search, recommendation, and advertising scenarios, bringing significant business value.
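The "dynamic elastic features" idea can be sketched as an embedding table that grows as new sparse feature IDs arrive, rather than requiring a fixed vocabulary size up front. The class and method names below are illustrative, not DeepRec's actual API:

```python
class ElasticEmbedding:
    """Toy sketch of a dynamically growing embedding table: each new
    sparse feature ID gets a slot on first access, so the vocabulary
    does not need to be sized in advance (illustrative only)."""

    def __init__(self, dim):
        self.dim = dim
        self.table = {}  # feature ID -> embedding vector

    def lookup(self, feature_id):
        # Allocate a vector lazily the first time a feature ID is seen;
        # a real system would initialize it randomly and train it.
        if feature_id not in self.table:
            self.table[feature_id] = [0.0] * self.dim
        return self.table[feature_id]

emb = ElasticEmbedding(dim=8)
vec = emb.lookup("user:12345")  # new ID allocated on the fly
emb.lookup("user:12345")        # repeated lookups reuse the same slot
```

In production, such tables also need eviction of stale IDs and sharding across workers, which is where the distributed-training optimizations mentioned above come in.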
Chen Ding | Technical Expert of Alibaba Cloud PAI
Building a production ecosystem around MegEngine's algorithms
The application of AI technology has been validated in various fields, delivering higher productivity than traditional algorithms. However, as the demand for AI algorithms grows, the traditional algorithm production method, in which data collection, annotation, model training, validation, and delivery are all tailored to a specific scenario, has become a bottleneck for AI implementation. The MegEngine team proposes a standardized algorithm production method built around the MegEngine training framework at each stage to lower the threshold for AI implementation. To support algorithm production, MegEngine has developed a series of components that together form the MegEngine algorithm production ecosystem and are gradually being open-sourced.
Qiyou Chen | Head of MegEngine team at Megvii
Primus - Universal Distributed Training Scheduling Framework
In recent years, machine learning technology has taken root in many application fields and brought significant improvements. To keep up with ever-growing training data and model sizes, distributed training has emerged for more efficient model training. As a general-purpose distributed training scheduling framework, Primus provides a universal interface that bridges distributed training tasks and physical computing resources, allowing data scientists to focus on designing learning algorithms. It also allows training tasks to run on different types of computing clusters, such as Kubernetes and YARN. On this foundation, Primus provides the fault tolerance and data scheduling capabilities that distributed training tasks require, further enhancing the usability of distributed training.
1. Overview of Distributed Training
2. Structure and Functionality of Primus
a. Data Scheduling Capability
3. Current Status and Future Development Plans for Primus
[You Will Gain]
Insights into ByteDance's current status and practices with Primus
Related challenges in the field of distributed training along with future prospects
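A hypothetical sketch of what a universal interface bridging training tasks and physical computing resources might look like; all names here are illustrative assumptions, not Primus's actual API:

```python
from dataclasses import dataclass

@dataclass
class TrainingTask:
    # A cluster-agnostic description of a distributed training job.
    name: str
    num_workers: int
    data_path: str  # hypothetical example path, not a real dataset

class ClusterBackend:
    # Common interface: each cluster type implements the same submit().
    def submit(self, task: TrainingTask) -> str:
        raise NotImplementedError

class KubernetesBackend(ClusterBackend):
    def submit(self, task):
        return f"k8s job {task.name} with {task.num_workers} workers"

class YarnBackend(ClusterBackend):
    def submit(self, task):
        return f"yarn app {task.name} with {task.num_workers} workers"

def schedule(task: TrainingTask, backend: ClusterBackend) -> str:
    # Data scientists describe the task once; the framework maps it
    # onto whichever cluster backend is configured.
    return backend.submit(task)

msg = schedule(TrainingTask("ctr-model", 4, "hdfs://data/train"),
               KubernetesBackend())
```

The value of the pattern is that the same `TrainingTask` runs on Kubernetes or YARN by swapping the backend, which is the portability the talk describes.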
Hebang Xu | Infrastructure Computing Framework R&D Engineer of ByteDance
Boost ML Upstream Frameworks with transparent backend graph compilers seamlessly
From cloud to edge, AI workloads are increasingly managed and orchestrated by top ML frameworks such as Ray. At the same time, AI acceleration is provided by diverse vendors' accelerators such as the Nvidia GPU series, Intel Movidius VPU, and Google TPU, alongside many ASIC-based accelerators. A variety of graph compilers, such as TVM, Intel OpenVINO, and TensorRT, exist to improve ML performance, but the landscape is fragmented. Users therefore struggle to harness these heterogeneous AI accelerators and software accelerations in the real world, because no general unified framework supports them natively. In this talk we review whether and how our transparent backend acceleration technologies can boost ML performance automatically on heterogeneous AI accelerators, using mainstream graph compilers seamlessly within popular upstream ML frameworks such as TensorFlow, PyTorch, TorchServe, and TensorFlow Serving. With our zero-code-change approach, users see their ML/AI performance boosted on their original AI applications.
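A minimal sketch of the zero-code-change idea: the user's model is wrapped once, and the wrapper routes execution to the best available backend while the application calls it exactly as before. The backend probing and names below are illustrative assumptions, not the speaker's actual implementation:

```python
def available_backends():
    # Hypothetical probe of installed graph compilers / accelerators;
    # a real implementation would detect hardware and libraries at runtime.
    return ["openvino", "tvm", "default"]

# Preference order among backends, best first ("default" = plain eager mode).
PREFERENCE = ["tensorrt", "openvino", "tvm", "default"]

def compile_transparently(model_fn):
    # Wrap the user's model function once; the caller's code is unchanged.
    found = available_backends()
    backend = next(b for b in PREFERENCE if b in found)

    def wrapped(x):
        # A real implementation would hand the captured graph to the chosen
        # compiler; this sketch just records which backend would run it.
        return {"backend": backend, "result": model_fn(x)}
    return wrapped

fast = compile_transparently(lambda x: x * 2)
out = fast(21)
```

The point of the pattern is that the selection logic lives entirely inside the wrapper, so swapping or adding an accelerator never touches the user's application code.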
Tiejun Chen | Senior Technical Lead
Challenges and Attempts in Developing Multi-modal AI Applications
Compared with traditional single-modal AI applications, many technical issues still need to be solved in developing multi-modal AI applications. In this context, Jina AI explores this new application scenario and its technological challenges in depth, providing developers with a one-stop MLOps platform and empowering all developers to realize super cool multi-modal AI ideas.