FAQ: AI in research


 

What are the possible uses for researchers and students?

Artificial intelligence is revolutionising various aspects of academic research. One of its most promising applications is the creation of syntheses of research domains. This task, which involves compiling, analysing and condensing a vast quantity of research work on a given subject, can be laborious and time-consuming for researchers. AI, with its ability to rapidly process large volumes of data, could make this process much easier.

Several applications, such as ChatPDF or Sharly AI, have been developed to browse and understand a wide range of research documents, such as journal articles, study reports, theses and other types of publication. They are then able to extract key ideas, results and trends, and present this information in a concise and easily accessible way. This enables researchers to quickly get to grips with the current state of an area of research, without having to sift through and assess the relevance of the documents themselves.

However, it is important to note that these tools still have difficulty grasping the context and processing the nuances and subtleties present in scientific documents. For example, these programmes could misinterpret or oversimplify a complex concept or omit an important nuance that seems obvious to the human reader. Furthermore, AI reproduces biases inherent in the documents on which it has been trained, which can affect the accuracy and precision of the syntheses produced.

One of the possible uses of artificial intelligence is the examination of existing gaps in various areas of research. The associated tools, which are capable of processing and analysing large sets of textual data, could prove to be very useful in identifying where further research is needed.


Platforms such as Elicit or Consensus have the ability to review countless research articles in record time, extracting the conclusions, methods and main themes presented. They can then scan this information to identify trends, patterns and topics that have been relatively underexplored. For example, an AI system could identify a question that has frequently been raised but never answered satisfactorily, or a research approach that, although contested, has never been refined. This enables it to highlight gaps in current knowledge and suggest directions for future investigation.

However, it is essential to bear in mind that AI has its limits in this context. It could, for example, prove difficult to grasp the nuances and subtleties inherent in academic research and, as a result, not be able to reliably assess the quality of studies or the relevance of conclusions. In addition, as these methods rely on pre-existing data for their analysis, they may not detect shortcomings attributable to biases or limitations in these data.

The adoption of artificial intelligence solutions for the drafting of scientific papers offers attractive prospects for improving the efficiency and productivity of researchers.

In terms of writing, AI systems such as ChatGPT or SciSpace have the ability to assist in structuring a document, to define a logical layout for the presentation of information and even to produce text. For example, a researcher can provide ChatGPT with a set of key points or results that they wish to include in their article, and the AI is able to create a draft text based on this information. These tools can also help formulate the more technical sections of the article, adopting the jargon and terms specific to the discipline.

In other words, AI is able to produce a draft text that can be used as a starting point. This could help overcome the «blank page syndrome» and speed up the writing process.

However, it is imperative that the content generated is scrutinised and reworked by the researcher. The AI produces texts that appear plausible but tend to contain errors or inaccuracies. Moreover, it does not always have the ability to understand and integrate the context in an appropriate way, which is fundamental in the writing of research articles.

Moreover, these algorithms are not currently capable of reproducing the creativity and originality that are specific to human beings. Although AI is a powerful tool for generating text, it is not capable of conceiving new ideas or perspectives. It can help to automate certain parts of the research process, but it is not capable of replacing the intellectual and inventive contribution of researchers.

Artificial intelligence-based writing assistants, such as Grammarly, ChatGPT and DeepL, offer a variety of services to improve and facilitate the writing process. They can be used for everything from helping to improve grammar and style, to generating initial draft text, to suggesting structures for research documents.

As part of improving writing style, these tools are capable of analysing a text and identifying grammar, spelling and punctuation errors, as well as awkward or confusing sentences. They are also able to suggest rewordings or improvements to make the text clearer, more concise and more engaging. These suggestions could help to improve the quality of the writing and maintain a consistent tone and style throughout the document.

When it comes to structuring a document, some writing assistants can suggest a logical organisation of ideas and results. They are able to help define a clear structure for a research article, identifying where to introduce the different sections, how to organise the arguments and how to present the results in a coherent and convincing way. This seems particularly useful when writing long and complex research papers.

However, it is important to note that, despite all these useful features, any text generated or enhanced by an AI-based writing assistant must be carefully reviewed and edited by the researcher.

The platforms and frameworks around artificial intelligence (for example Tableau, Power BI, TensorFlow, Keras or Scikit-learn) have revolutionised the way researchers can examine large quantities of data. They are particularly useful in fields such as bioinformatics, climate study, sentiment analysis in social media and other areas that generate massive amounts of data.

Firstly, AI is able to help structure and organise data. This can mean classifying data into appropriate categories, identifying groups or even revealing complex relationships between different variables. For example, machine learning algorithms such as clustering or classification are known to organise data into groups on the basis of common characteristics.
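
As a concrete and purely illustrative sketch of this grouping step, the snippet below uses the k-means algorithm from scikit-learn, one of the libraries mentioned above, to separate synthetic observations into two clusters on the basis of shared characteristics; the data, the scaling step and the number of clusters are assumptions made for the example only.

```python
# Minimal sketch: grouping observations by shared characteristics with k-means.
# The toy dataset and the choice of two clusters are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(seed=0)
data = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(50, 2)),  # first synthetic group
    rng.normal(loc=5.0, scale=1.0, size=(50, 2)),  # second synthetic group
])

# Standardise the variables so that no single scale dominates the distances.
scaled = StandardScaler().fit_transform(data)

# Ask k-means for two clusters and inspect which group each observation joins.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(labels[:10], labels[-10:])
```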

Secondly, AI solutions are able to identify trends and patterns in data. Techniques such as deep learning have proven their ability to assimilate complex representations of data and make accurate predictions based on these representations, in particular to recognise recurring patterns or to make predictions from existing data.
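
As a hedged illustration of this second point, the sketch below trains a very small neural network with Keras (also cited above) to recover a simple non-linear rule from labelled examples and then predict on unseen observations; the synthetic data and the network architecture are invented for the example and do not correspond to any particular research workflow.

```python
# Minimal sketch: a small neural network (Keras) learning a pattern from
# labelled data and predicting on new observations. Data and architecture
# are illustrative assumptions only.
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(seed=1)
X = rng.normal(size=(500, 4))                 # 500 observations, 4 variables
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)  # a non-linear rule to recover

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=32, verbose=0)

# Predict the probability of the positive class for three new observations.
print(model.predict(rng.normal(size=(3, 4)), verbose=0))
```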

There is also a wide range of software available for visualising data in a clear and accessible way: generating graphs, diagrams and other visualisations that make it easier to understand trends and relationships in the data.
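
Tableau and Power BI are point-and-click tools; as a code-based stand-in, the short matplotlib sketch below shows the principle of turning a series of values into a readable trend plot. The data and labels are invented purely for illustration.

```python
# Minimal sketch: visualising a trend in tabular data with matplotlib.
# The values plotted here are invented for illustration only.
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2010, 2025)
values = 100 + 3 * (years - 2010) + np.random.default_rng(2).normal(0, 4, len(years))

plt.figure(figsize=(6, 3))
plt.plot(years, values, marker="o", label="observed values")
plt.xlabel("Year")
plt.ylabel("Measured quantity (arbitrary units)")
plt.title("Example trend over time")
plt.legend()
plt.tight_layout()
plt.savefig("trend.png")  # or plt.show() in an interactive session
```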

However, although these tools offer considerable advantages for analysing large quantities of data, they also have constraints. They are highly dependent on the quality of the training data, and their ability to produce reliable and relevant results is compromised if the data are biased, incomplete or erroneous. In addition, these artificial intelligence models are often seen as black boxes: they are capable of making accurate predictions, but it is sometimes difficult to grasp how they arrived at them, owing among other things to the complexity of the models, their non-linearity and the proprietary nature of some of the software.

The integration of artificial intelligence into the research process offers a variety of benefits that have the potential to fundamentally transform the research landscape. These solutions can provide valuable assistance at every stage of the research process, leading to improved efficiency and accuracy and a more dynamic way of working.

When gathering data, many solutions, such as IBM Watson Discovery, are able to help automate information gathering, structure unstructured data and clean it for further analysis. Tools such as OpenRefine have proven their effectiveness in optimising databases, enabling more efficient search and information extraction.
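
OpenRefine and Watson Discovery are standalone services, so the sketch below uses pandas merely as a stand-in to show what a typical cleaning step looks like: harmonising labels, converting types and removing duplicates and missing values. The table and its column names are invented for the example.

```python
# Minimal sketch: cleaning a small table before analysis (pandas).
# The messy values below are invented; OpenRefine itself is a separate
# application and is not used here.
import pandas as pd

raw = pd.DataFrame({
    "institution": ["UNIL ", "unil", "EPFL", "EPFL "],
    "year": ["2023", "2023", "2024", "2024"],
    "count": ["12", "12", "8", None],
})

clean = (
    raw.assign(
        institution=raw["institution"].str.strip().str.upper(),  # harmonise labels
        year=pd.to_numeric(raw["year"]),
        count=pd.to_numeric(raw["count"]),
    )
    .drop_duplicates()          # remove exact duplicate rows
    .dropna(subset=["count"])   # discard rows with missing measurements
)
print(clean)
```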

At the data analysis stage, AI, with platforms such as DataRobot, is proving powerful in facilitating the processing of large data sets, detecting patterns and trends, and providing relevant insights. Machine learning algorithms are particularly effective at handling much larger quantities of data than a human would be able to process, and can uncover complex and non-linear relationships in the data.

As noted, many solutions, including Grammarly, can act as genuine writing assistants, helping to generate drafts, suggesting stylistic and grammatical improvements and assisting with the structuring and organisation of the article.

In terms of literature review, platforms such as Semantic Scholar offer the ability to rapidly analyse a large volume of documents to potentially identify gaps in research, which can guide future research efforts.

Despite these benefits, the adoption of AI as an integral part of research is not without its challenges. It is crucial that researchers maintain a critical sense and basic understanding of the software they are handling (e.g. to evaluate the results generated, understand the limitations and be aware of the potential biases introduced).

What are the limits of AI tools in research activity?

The use of AI tools like Elicit to generate literature reviews in manuscripts offers notable benefits, such as time savings and increased efficiency. However, this practice also involves significant risks.

Firstly, if many researchers rely on the same AI tools for their literature reviews, scientific articles may become uniform and lack originality. This can reduce interest and diversity of perspectives in the scientific literature, making articles less engaging to read and analyse.

Secondly, there is a limitation in source accessibility. These tools primarily rely on free and accessible databases such as Semantic Scholar. They generally do not have access to articles behind strict paywalls, limiting the richness of available sources for literature reviews. For instance, Elicit indexes abstracts, open access articles, and preprints, but strictly paid articles often remain inaccessible without institutional or individual subscriptions.

Finally, excessive reliance on AI tools for literature reviews may lead to a decline in researchers' critical thinking skills. This could reduce their ability to analyze existing work in depth. The exercise of synthesizing literature is essential for identifying gaps, controversies, and future research directions.

AI tools are based on pre-programmed algorithms that analyse data in a specific and deterministic way, even though they can learn from the data and adjust their behaviour accordingly. They tend to focus on certain terms and expressions without taking into account the overall meaning of the prompt. This can produce responses that are inconsistent or strongly detached from the original context, especially when that context is complex or subtle. Irony and sarcasm, for example, remain arduous tasks for these programmes, as they require an understanding of context that goes beyond the mere literal meaning of the words.

Similarly, in the analysis of complex data, AI might be able to identify trends and patterns, but be unable to understand the underlying context that gives meaning to those trends. For example, when analysing economic data, algorithms may detect a downward trend in prices, but they may not be able to grasp that this decrease is the result of an increase in production, unless this information is explicitly encoded in the data they are analysing.

The ability to base the result produced on reliable sources varies according to the tools. ChatGPT, Gemini, Microsoft Copilot and Claude are capable of citing sources with their DOI for established scientific concepts, but they still struggle to offer reliable sources for points of reflection and can even sometimes go so far as to produce fictitious sources. Yet rigorous referencing of sources is a pillar of academic research: it serves not only to credit researchers for their original work, but also to provide a clear path for verifying and reproducing their results.

That said, given that the field of artificial intelligence is constantly and rapidly evolving, efforts are being made to fill this gap. More advanced models such as GPT-4, with the help of plugins connected to the Internet, can be trained to identify and indicate when information is coming from a particular source. This functionality could, for example, be used to manage quotations automatically. However, ensuring the accuracy and reliability of these new capabilities remains a major challenge. In other words, although these methods can detect that information has come from a precise source, they may not be able to do so with 100% accuracy.

Moreover, even if AI were able to quote sources correctly, this would not replace the human ability to assess their relevance and reliability.

The quality of the information produced depends heavily on the quantity and quality of the data sources used for training, as well as the human adjustments made during the process. A critical approach to the results is therefore essential. Indeed, AI tools reproduce and even amplify the biases and prejudices found in their training data.

Machine learning, a branch of artificial intelligence, operates by discerning patterns within sets of data. If these data contain biases, these will be incorporated into the models. For example, if the algorithms are trained on a body of text that is predominantly male, western and science-focused, the findings of this AI will favour these views.

This can have significant consequences, especially when AI is employed in sensitive contexts such as automated decision-making. For example, a book recommendation system based primarily on works by Western authors could fail to suggest books by authors from other parts of the world, resulting in an unbalanced cultural representation.

In addition, biases in training data can reinforce existing patterns. If, for example, training data disproportionately associate certain professions with a certain gender, these systems will propagate these patterns in their conclusions.

To minimise these risks, it is essential to pay particular attention to the quality and diversity of the data provided as input to AI tools. Researchers must also be aware of these potential biases and take them into account when using these tools. For example, by employing various methods to examine and validate the results obtained, or adopting complementary approaches to balance the perspectives presented by AI.

AI tools sometimes generate statistical hallucinations, i.e. results that seem entirely plausible when in fact they are inaccurate or inappropriate, which creates a major challenge in verifying the content proposed by the AI. Examples include inventing statistics for a university study without any real data on the subject, creating fictitious scientific terms or generating non-existent or erroneous biographical data.

These errors can have a variety of causes: the model, for example, may misinterpret the data it is trained on, or develop incorrect information based on poorly acquired patterns. It is also important to note that these models operate probabilistically. They predict the next word in a sequence based on probabilities computed by neural network architectures, such as those of the transformer type. This makes them likely to produce results that are credible on the surface, but not necessarily aligned with reality.
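
The toy sketch below makes this probabilistic mechanism concrete: each candidate word receives a raw score, the scores are turned into probabilities with a softmax, and one word is sampled. The vocabulary and scores are invented; in a real transformer the scores are computed from the entire preceding context.

```python
# Toy sketch of probabilistic next-word prediction: the model assigns a score
# to each candidate word, converts the scores to probabilities and samples one.
# Vocabulary, scores and seed are invented for illustration only.
import numpy as np

vocabulary = ["results", "hypothesis", "banana", "method"]
scores = np.array([2.1, 1.4, -3.0, 0.7])   # raw model scores (logits)

probabilities = np.exp(scores) / np.exp(scores).sum()   # softmax
rng = np.random.default_rng(seed=0)
next_word = rng.choice(vocabulary, p=probabilities)

print(dict(zip(vocabulary, probabilities.round(3))), "->", next_word)
```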

These statistical hallucinations are particularly problematic when these systems are exploited to analyse huge volumes of data or to create sophisticated content. In such cases, it is difficult to verify the accuracy of each piece of information produced. This can lead to errors or confusion if users rely on the results without proper critical examination.

It is therefore vital to reduce the associated risks by verifying the content created by AI, relying on additional sources of information to test the accuracy of the results, and generally exercising a critical approach to the use of AI tools.

The lack of originality in the content produced by generative AIs is often described as that of a "stochastic parrot", because their basic mechanism consists of constructing text based on patterns observed in the training data, without devising any innovative or independent content.

Language processing models such as GPT are trained on large corpora of text and learn to predict which word or phrase is likely to follow, depending on the context. This way of generating text is essentially based on probabilities and does not interpret meaning or context in the way that a human would. So, even if these models can compose logical and well-articulated text, they do not conceive truly new or original ideas.

However, there is a dispute within the AI community as to whether these models are considered creative or not. Some argue that, even if the content produced by AI is based on patterns that it has assimilated, it can combine them in singular or unexpected ways, which can be seen as a form of creativity. For example, a language model might devise a poetic metaphor or an inventive expression by combining elements from different contexts that it has learned.

Notwithstanding this, it remains clear that AI is no substitute for human ingenuity and originality, since it is not able to have creative intentions or to generate new ideas in the same way as the human mind.

The simplicity of using artificial intelligence programs in research can create a dependency, which can lead to several consequences.

Firstly, there is the fear of a weakening of skills among researchers and students. Constant use of AI applications to carry out tasks that they would otherwise have to perform themselves may lead to a loss of traditional research skills. Researchers and students may become less adept at tasks such as analysing data, formulating hypotheses and conducting research, which weakens their ability to carry out critical and independent analysis, a fundamental skill in research. Added to this could be an excessive confidence in the results produced by AI, induced by this dependence.

The quality of the instructions given by the user strongly influences the quality of the results obtained.
See: What are the benefits of integrating AI tools into teaching?

In their first versions, AI tools showed little creativity or empathy: the texts produced were noticeably lacking in these qualities and tended to be bland. Today, although the free version of ChatGPT (GPT-3.5) still has a style that can feel robotic, the advances made with models such as GPT-4 mark a significant turning point. With this more recent version, distinguishing between a text written by a human and one generated by AI has become almost impossible.

Plagiarism and fraud

Legislation relating to plagiarism does not apply to the use of generative AI for academic work. Nevertheless, Directive 0.3 of UNIL stipulates that all work submitted must be authentic, and it is with this in mind that it is important to remember that handing in work produced by an AI without mentioning this use constitutes a serious breach of the principles and rules in force relating to academic integrity.
Interview with Philippe Gilliéron

The rapid development of AI and its growing use in research and scientific publications raises a whole raft of complex ethical and legal issues. The use of AI makes the detection of plagiarism and the establishment of authorship particularly complex. Current copyright legislation is struggling to keep up with the meteoric rise of technology and to address these issues. It is therefore imperative that we continue to reflect on these issues and on how the law can evolve to meet these challenges. In all cases, transparency is essential when using this type of tool, in order to best respect the principles of scientific integrity.

AI has the ability to rewrite an existing work in such a way as to make it undetectable by traditional plagiarism detection tools. This possibility raises major ethical problems, in that it allows an original work to be exploited without giving any credit, in contradiction with the principles of scientific integrity and intellectual property law. As the CNRS explains, plagiarism of published texts ranges from more or less extensive copying without appropriate attribution to direct borrowing or paraphrasing (CNRS ethics committee PDF).

As regards authorship, the challenge lies in identifying the author of a work produced by AI. Historically, copyright has protected works created by the human mind, but with AI, this definition is being called into question. According to Philippe Gilliéron, professor of law at UNIL and a lawyer, the mind, and the text of another person, remains the prerogative of the human being. To this day, the use of texts generated by systems such as ChatGPT does not qualify as plagiarism, simply because such systems are not yet recognised as having a "mind" as such.

The essential point remains the angle from which this problem is approached: it is not just a question of copyright, but also of scientific integrity. In any case, presenting a text created by AI as the work of a person, without mentioning the use of the associated tools, is misleading and contrary to ethics.

How can the use of AI tools in research be made acceptable?

As is already largely the case for the analysis of statistical data sets, the commands and procedures used should be clearly set out in order to meet reproducibility standards.

With this in mind, publishers such as Elsevier, Springer and Cambridge University Press have already taken a stance on the issue and provide clear guidelines.

Another key component of transparency is the description of the potential limitations and sources of imprecision associated with the use of these systems. This may include discussion of issues such as bias in learning data, difficulties in verifying AI-generated content, or limitations in contextual understanding. By providing this level of detail, researchers help prevent misunderstandings and set realistic expectations about what these tools can and cannot achieve.

In addition, clarification is essential in determining ethical responsibility when using AI. This means clearly specifying which part of the work has been carried out by these programmes and which part by the researchers. Distinguishing between the work carried out by the machine and that carried out by individuals helps to identify where responsibility for the results of research lies and to ensure that AI is used ethically and responsibly.


In short, clarification and transparency are essential to preserve the integrity of research, to build public trust and to ensure that artificial intelligence tools are used appropriately in research.

The few publishing houses that have looked into the matter generally recommend that authors disclose the use of AI tools in a disclaimer section at the end of the manuscript, just before the references. This declaration should include at least the name of the AI tool used, the version (or year, if applicable) and the specific sections of the document where this technology has been employed.

No. Authorship requires taking responsibility for content, consenting to publication via an author's publication agreement, and providing contractual guarantees on the integrity of the work, among other things. These responsibilities, which are intrinsically human, cannot be assumed by AI. Consequently, AI tools must not be credited as authors.

No. When an expert is asked to review a manuscript, it is imperative that the confidentiality of the content is maintained. Reviewers must not upload the manuscript, or any part of it, into a generative AI tool, as this may infringe the confidentiality and property rights of the publishing house and the authors. In addition, if the manuscript contains personally identifiable information, this may breach data protection rights.

In addition, the strict integrity required in the review process entails responsibilities that can only be assumed by humans. AI technologies should not be used by reviewers to facilitate the evaluation of a manuscript, as the critical reflection and original analysis required are beyond the capabilities of these technologies. There is also a risk that these technologies may produce erroneous, incomplete or biased conclusions.

In general, no. The rapid development of the field of image creation by generative AI raises new legal issues relating to copyright and research integrity. As long as these issues surrounding images and videos generated by AI remain largely unresolved, publishers cannot authorise their use for publication.

That said, there is some tolerance regarding the use of AI-generated images in more informal communication contexts, such as flyers, posters, emails or event web pages.

Please note: not all AI tools are necessarily generative. The use of non-generative machine learning tools to manipulate, merge or enhance existing images or figures should be mentioned in the appropriate legend when submitting, allowing for a case-by-case assessment.


Yes, it is possible to use AI tools to help draft scientific funding applications, but certain essential precautions must be taken. Applicants must take full responsibility for the content generated and respect the principles of scientific integrity and data confidentiality. They must also consider the ethical and legal implications, particularly in relation to intellectual property and the management of sensitive data. Having entire applications summarised or translated by generative AI tools may contravene the principle of confidentiality, as it is prohibited to transmit data to unauthorised third parties such as AI application providers.

The SNSF, for example, stresses the importance of these precautions and guarantees the strictly confidential treatment of projects.

Yes, there are European guidelines for the responsible use of generative AI in research. On 20 March 2024, the European Commission, together with the countries of the European Research Area, issued recommendations to support researchers. They encourage integrity and a coherent approach, warn of the risks of plagiarism, disclosure of sensitive information and bias, and emphasise transparency and accountability.

The principles set out in the UNIL FAQ, formulated by the Research Department prior to these European recommendations, remain valid and are in line with these broad guidelines.