Author: Liao Canliang
Source: All Media Exploration, April 2024
Editor's note
The media resources that mainstream media have accumulated over decades are important assets of media organizations. As AI and media become more deeply integrated, the media library will go from being "somewhat relevant" to being "vital" to the integrated development of mainstream media.
How can these assets be revitalized so that existing holdings drive new growth, lay the foundation for precise content distribution, repeated dissemination, secondary creation, online trading and copyright protection, and contribute to "news+government service business"? This issue presents the third in a series of special seminars on "Platform Construction of Mainstream Media", and industry peers and academic experts are invited to discuss the construction of media assets in the era of intelligent media.
The media library, short for media asset repository, covers all finished products and materials that a media organization accumulates in the course of content collection, production, distribution and operation.
At present, artificial intelligence (AI) technology is being integrated into countless industries, driving a new round of scientific and technological revolution and industrial transformation, and it will also reshape the media landscape, modes of communication and the public opinion ecosystem. Mainstream media are actively exploring the application of AI in news gathering, production, distribution, reception and feedback, so as to create a new ecosystem of intelligent media and to consolidate and expand mainstream ideology and public opinion in the new era.
The Media Library Is Key to Media Integration in the Era of Artificial Intelligence
At present, major media organizations invest too little in building and developing their media libraries. Many libraries have gone no further than digitizing historical text reports, the digitization of picture, audio and video archives is slow, and both the innovative development of media libraries and their role in media integration are still at an early stage. There are two reasons. First, the media library is an auxiliary to the core news business of mainstream media rather than a necessity, so it has long been neglected. Second, building and developing a media library requires heavy investment in capital, technology and talent, and does not directly bring considerable economic returns.
With the development of AI technology, this pattern of low investment is expected to change fundamentally. Data is the fuel of AI and the cornerstone of intelligent development. In the course of news reporting and media integration, mainstream media have accumulated a large volume of content products and source material, and have collected a large volume of government affairs data, service data and business data, all of which can effectively support AI learning and training and raise its level of intelligence.
A study by Epoch AI, a well-known AI research institution, predicts that as AI technology develops, high-quality data will become scarce by 2026 and low-quality data will be exhausted between 2030 and 2050. Abroad, The New York Times and other media have sued OpenAI, the developer of the generative AI application ChatGPT, for "training generative artificial intelligence applications with published news works without authorization". CNN, the Associated Press, Fox and Time have also held repeated negotiations with OpenAI over licensing their content for AI training. This shows, from one angle, that media data is high in quality, rich and scarce, and that its application scenarios and market are very broad. A media library built on this data is not only an important asset of a media organization but also the key to seizing the opportunity that AI presents, enabling the organization to establish the "news+government service business" operating model and to promote the in-depth development of media integration.
As AI and media become more deeply integrated, the media library will go from being "somewhat relevant" to being "vital" to the integrated development of mainstream media.
(A) The media library is the basis of intelligent production and dissemination
The media library will fully empower the intelligent production and dissemination of media content and accelerate the development of media intelligence. At present, AIGC (AI-generated content) application platforms, represented by ChatGPT, are developing rapidly. It is generally believed that AIGC will become a new mode of content production, following professionally generated content (PGC) and user-generated content (UGC), that it will be widely used across content production, and that it will replace some manual creation.
The media library is the foundation of media AIGC. AI news writing, AI painting, AI video generation, AI virtual scene generation and other AIGC applications all depend on AI learning from and training on media library data. AI content review, precise dissemination of media reports and accurate evaluation of communication effects likewise depend on AI data mining and predictive analysis of media library data.
(B) The data in the media library determines the media's intelligent services
With the integration of AI technology, the type and quantity of data in the media library will determine the type and level of intelligent services that mainstream media can provide. In the AI era, without data it is difficult for mainstream media to offer intelligent services, and their influence and competitiveness will be diminished.
For example, many mainstream media have set up platforms through which the public can interact with government bodies and raise their concerns, including People’s Daily’s Leadership Message Board, Xinjiang Daily’s Pomegranate Cloud 12345 Asking for Politics, Sichuan Daily’s Asking for Politics in Sichuan, Hebei News Network’s Sunshine Politics and Hualong.com’s Chongqing Online Asking for Politics Platform. The government affairs data and operational experience accumulated on these platforms will underpin mainstream media's AI capabilities for government services.
Exploring Applications of the Media Library in Artificial Intelligence
The rapid development of AIGC platforms points to a direction for the innovative development of media libraries. At present, mainstream media's exploration and development of media libraries in the AI field mainly takes the following directions.
(A) Mainstream value corpus
An AI "brain" is both a high-tech brain and a brain with values. An AI platform takes a position, and the content it generates has an orientation, which is essentially determined by the corpus the AI learns from and by its algorithms. For example, ChatGPT has been accused of being "full of western ideology and American political correctness". The root cause is that Silicon Valley and US technology circles have long been a stronghold of American values, and most of ChatGPT's training data comes from Western sources, so its output naturally carries Western ideology.
In promoting media integration, mainstream media should not blindly adopt or move onto the AIGC platforms of commercial enterprises, but should pay special attention to the corpus data fed to the AI and to the orientation of the model algorithms themselves. According to the Comprehensive Capability Evaluation Report of AI Big Models released by People’s Data, domestic mainstream large models still have room for improvement in terms of content ecology. Some large models evade sensitive topics to varying degrees, and some give emotionally charged answers. This suggests that mainstream value corpora are currently scarce on the market and cannot support the learning and training of large models. As the main force for consolidating and expanding mainstream public opinion and extending the influence of mainstream values, mainstream media need to innovate with their media libraries and establish mainstream value corpora, so as to further fulfil their role in guarding ideological security in the AI era.
For example, to address major, sensitive and difficult questions that large models generally cannot answer or cannot answer well, People’s Daily mobilized all of its staff and pooled resources from all sides to build a "mainstream value corpus" comprising a basic corpus, key-field corpora and a sensitive question-and-answer corpus. So far it has completed a question-and-answer corpus of 120,000 questions, corpora covering 16 key fields and a basic corpus of more than 30 billion words. The corpus has been integrated and connected with many domestic mainstream large models, with substantial improvement.
The People’s Daily Online "Mainstream Value Corpus"
(B) Industry application models
Media innovation and integration can proceed on three levels: first, integration within a media organization, that is, between traditional and emerging media; second, integration across the media industry, between one media organization and another; third, integration of the media with other industries, so that the media can grow through deep integration with every sector.
Through industry reporting and the operation of industry content, mainstream media have accumulated a large volume of high-quality industry data. This data can be turned into a high-quality corpus for training vertical industry models, providing data and technical support for developing such models and laying the foundation for the next step of integration with other industries.
For example, the "People’s Intelligent Media Model" developed by People’s Daily Online provides the State Seismological Bureau with a popular science question-and-answer application on earthquake knowledge. The application is trained on popular science books about earthquakes and related standards documents, and it effectively improves the efficiency of public education on basic earthquake knowledge, earthquake disaster prevention, earthquake emergency rescue, and earthquake early warning and response.
(C) Content risk control applications
In the AI era, information is everywhere, reaches everything and is used by everyone, and intelligent content risk control has broad application scenarios. The finished reports held in the media library embody, to a certain extent, the ability and experience of mainstream media in reviewing and vetting content. By having AI learn from and train on this data, we can develop content risk control applications and comprehensively extend the ideological gatekeeping ability of mainstream media.
The People’s Daily Online "People’s Review" system
For example, People’s Review, an intelligent review platform for political content launched by People’s Daily, takes the People’s Daily media library as its core data and builds a political knowledge base from the exclusive resources of People’s Daily and the experience of its senior editors. It has intelligent risk control modules such as political text review and visual content detection, and supports online detection, text review, picture review, video review and user-defined word lists. At present, "People’s Review" has provided content checking and inspection services for more than 300 customers. As AI models develop further, "People’s Review" will help review the data corpora, generated content and online courses of AI training.
(D) Intelligent manuscript drafting applications
The massive government affairs data in the media library, such as current political news reports, leaders’ speeches, policy documents and official reports, lays the foundation for mainstream media's ability to draft party and government manuscripts intelligently. In developing intelligent drafting applications, the media library has two irreplaceable advantages. First, the data comes from mainstream media reports, which ensures the political direction, value orientation and public opinion orientation of the AIGC output. Second, the writing logic and sentence structure of the original data suit the application scenarios of party and government organs, public institutions and state-owned enterprises.
For example, People’s Daily has launched the "Zuoyi" creation engine, an artificial intelligence writing secretary that relies on the State Key Laboratory of Communication Content Cognition built by People’s Daily Online. Through AI training it learns from data sets and media corpora that conform to China’s mainstream values, covering politics, the economy, culture, society, ecology, party building, national defense, diplomacy and other key fields. This ensures the safety of the generated content, and the engine focuses on providing high-quality, safe intelligent drafting services for party and government organs, public institutions and state-owned enterprises. At present, Zuoyi has provided application services for many party and government organs and large state-owned enterprises.
The artificial intelligence writing secretary "Zuoyi"
(E) AI-generated content detection
As AI technology becomes further integrated with the content industry, AIGC has entered a new stage of development. The accompanying risks, such as content infringement, phishing, deepfakes and false information, have aroused widespread concern. Media reports are an important source of training data for AI content generation platforms. Having AI learn from and train on the relevant data, and launching targeted intelligent detection products, can help protect copyright and maintain content security, and such products have broad market prospects in both areas.
AIGC-X, the People’s Daily Online deep-synthesis content detection tool
For example, the "AIGC-X" application launched by the State Key Laboratory of Communication Content Cognition, which is overseen by People’s Daily, can quickly distinguish machine-generated content from human-created content, and its accuracy on Chinese text has exceeded 90%. In the next step, AIGC-X will be expanded into a general intelligent recognition model for AI-generated text, images and even video, contributing to the coordinated pursuit of AI security and development.
Further Innovation and Development of the Media Library
The deep integration of AI and media, and the resulting reshaping of the media landscape and the public opinion ecosystem, is the general trend. Mainstream media should plan ahead and innovate in practice in developing their media assets, seize the opportunity of AI development, empower the intelligent transformation of media, and promote the in-depth development of media integration.
(A) Actively enrich the media library
The breadth of high-quality data in the media library determines the depth of innovation and development possible in the AI field. In addition to historical reports and source material from traditional media, the following kinds of data deserve particular consideration.
The first is AI data. The integration of AI and media has deepened further, and the efficiency and quality of mainstream media content production have improved greatly. In the future, AI-generated data will grow explosively, and much media content will come from AI. Massive volumes of AI-generated data can therefore be added to the media library.
The second is industry data. In exploring the "news+government service business" business model, mainstream media should pay attention to accumulating and mining industry data, build databases for various industries, and increase the depth and breadth of their media assets.
The third is Internet data. Through open cooperation, mainstream media can collect relevant Internet data in a targeted way and expand the data volume of their media assets.
(B) Promote the intelligent construction of the media library
The construction of the media library cannot stop at digitizing text reports. Text, charts, pictures, audio and video reports and source material should all be digitized and intelligently tagged, so as to achieve digital storage, multimodal search and precise management of massive data. At the same time, all kinds of data should be cleaned, refined and classified to form specialized databases, in preparation for the innovative development of the media library.
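The tagging and search idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `MediaAsset` and `MediaLibrary`, the fields, and the sample records are hypothetical, and tag matching across modalities stands in for true multimodal search. It does not describe any media organization's actual system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaAsset:
    """One digitized item in the library (all fields are illustrative)."""
    asset_id: str
    modality: str  # e.g. "text", "image", "audio", "video"
    title: str
    tags: frozenset = field(default_factory=frozenset)


class MediaLibrary:
    """In-memory index mapping each tag to the assets that carry it."""

    def __init__(self):
        self._assets = {}
        self._by_tag = defaultdict(set)

    def add(self, asset):
        # Store the asset and index it under each lower-cased tag.
        self._assets[asset.asset_id] = asset
        for tag in asset.tags:
            self._by_tag[tag.lower()].add(asset.asset_id)

    def search(self, tags, modality=None):
        """Return assets carrying every requested tag, optionally limited to one modality."""
        wanted = [t.lower() for t in tags]
        if not wanted:
            return []
        ids = set.intersection(*(self._by_tag.get(t, set()) for t in wanted))
        hits = [self._assets[i] for i in sorted(ids)]
        if modality is not None:
            hits = [a for a in hits if a.modality == modality]
        return hits


# Hypothetical sample records, for illustration only.
library = MediaLibrary()
library.add(MediaAsset("a1", "text", "Earthquake preparedness explainer",
                       frozenset({"earthquake", "science"})))
library.add(MediaAsset("a2", "video", "Earthquake rescue drill footage",
                       frozenset({"earthquake", "rescue"})))
library.add(MediaAsset("a3", "image", "Flood relief photo",
                       frozenset({"flood", "rescue"})))

print([a.asset_id for a in library.search(["earthquake"])])
print([a.asset_id for a in library.search(["rescue"], modality="video")])
```

In a real library the tags would presumably come from automated recognition of text, images, audio and video rather than manual entry, and the index would sit in a database or search engine rather than in memory. The sketch only shows why consistent tagging is what makes search across different media types possible.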
(C) Accelerate the innovative development of media libraries in the field of artificial intelligence
Media assets are high-quality, scarce data assets of media organizations, but if they are not developed and used creatively they will remain "historical archives" and their data value will go unrealized. Mainstream media should therefore actively innovate with and develop their media libraries and, by bringing in technology, continue to empower media content production, intelligent communication and operational analysis.
In addition, media organizations can seek external cooperation, share and open up their media resource pools, and keep data resources circulating smoothly. It is necessary to plan ahead and pilot early, to accumulate data and experience through use, to improve the media library as it is used, and to provide support for the intelligent transformation and integration of media.
(The author is a researcher at the People’s Daily Online Research Institute)
This article was published in the April 2024 issue of All Media Exploration under the original title "Exploration and Suggestions on Innovative Development of Media Resource Library in the Age of Artificial Intelligence"; the references are omitted.