GSOC 2024 ideas list

Oliver Kopp edited this page Feb 25, 2024 · 41 revisions

JabRef in Google Summer of Code 2024

We strongly believe in open source and value interaction with a diverse community. JabRef aims to provide a welcoming experience to open source newcomers. We have participated in GSoC for three years with great results, and each project was a big step towards a more usable research tool.

Participants will grow their technical, coding, and open source experience. They will also expand their professional network.

Projects

This page lists a number of ideas for potential projects to be carried out by the persons participating in Google Summer of Code 2024. This is by no means a closed list, so the possible contributors can feel free to propose alternative activities related to the project (the list of feature requests and the GitHub issue tracker might serve as an additional source of inspiration). Students are strongly encouraged to discuss their ideas with the developers and the community to improve their proposal until submission (e.g., using the Gitter Channel or the forum). It's also a good idea to start working on one of the smaller issues to make yourself familiar with the contribution process.

Improve handling of ancient documents by OCR and AI

JabRef, a comprehensive literature management software, currently handles both metadata and text-based PDF documents. However, scanned PDFs, particularly historical articles, are image-based and therefore not text-searchable. This project aims to close this gap by integrating OCR (Optical Character Recognition) technology, enabling full-text search in scanned PDFs.

Expected outcome:

A) Develop a common interface within JabRef to accommodate multiple OCR engines, ensuring flexibility and extensibility.
B) Enable expert users to fine-tune OCR settings, catering to specific needs or document formats.
C) Incorporate the OCR-extracted text as a searchable layer in PDFs, allowing Apache Lucene to index and search the content.
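Outcomes A and B could start from a small engine abstraction. The following is only a sketch to seed discussion: `OcrEngine`, `OcrSettings`, `OcrEngineRegistry`, and `DummyOcrEngine` are hypothetical names, not existing JabRef classes, and the dummy engine stands in for a real OCR library.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// All names below are hypothetical; none of this is existing JabRef API.

/** Settings an expert user could tune (outcome B). */
record OcrSettings(String language, int dpi) {
    static OcrSettings defaults() {
        return new OcrSettings("eng", 300);
    }
}

/** Common contract every OCR backend would implement (outcome A). */
interface OcrEngine {
    String name();

    /** Returns the text recognized in a scanned PDF. */
    String recognize(Path scannedPdf, OcrSettings settings);
}

/** Stand-in engine so the sketch runs without a real OCR library. */
class DummyOcrEngine implements OcrEngine {
    @Override
    public String name() {
        return "dummy";
    }

    @Override
    public String recognize(Path scannedPdf, OcrSettings settings) {
        return "text of " + scannedPdf.getFileName()
                + " [" + settings.language() + ", " + settings.dpi() + " dpi]";
    }
}

/** Registry that lets the user pick an engine by name. */
class OcrEngineRegistry {
    private final Map<String, OcrEngine> engines = new LinkedHashMap<>();

    void register(OcrEngine engine) {
        engines.put(engine.name(), engine);
    }

    Optional<OcrEngine> find(String name) {
        return Optional.ofNullable(engines.get(name));
    }
}

class OcrSketch {
    public static void main(String[] args) {
        OcrEngineRegistry registry = new OcrEngineRegistry();
        registry.register(new DummyOcrEngine());
        OcrEngine engine = registry.find("dummy").orElseThrow();
        System.out.println(engine.recognize(Path.of("scan.pdf"), OcrSettings.defaults()));
    }
}
```

Keeping the settings in a separate value type means new engines can be added without changing the interface. Writing the recognized text back into the PDF as a searchable layer (outcome C) is deliberately left out of this sketch.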

Skills required:

  • Java, JavaFX
  • Curiosity

Possible mentors:

@Siedlerchr, @koppor

Project size:

175h (medium)

AI-Powered Summarization and "Interaction" with Academic Papers

This project aims to change the way researchers interact with academic literature in JabRef, using Artificial Intelligence (AI) to improve user experience and efficiency. The goal is to implement an AI feature allowing users to a) request summaries of PDF documents directly within JabRef and b) ask questions based on the "knowledge" inside the local PDFs. Ideally, the solution should work locally without any external cloud service.

Expected outcome:

Phase 1 (90h): Develop a module to connect JabRef with configurable online AI services that can generate summaries of academic papers and answer questions. Ensure this feature is user-friendly, allowing for seamless interaction (summary, asking questions) and customization according to user preferences. It has to be possible to ask questions covering selected (or even all) PDF files of a local library (.bib file with attached .pdf files).

Phase 2 (+90h): Develop a module to connect JabRef to a local AI service that can generate summaries of academic papers and answer questions. Ensure this feature is user-friendly, allowing for seamless interaction (summary, asking questions) and customization according to user preferences. There must not be any remote connection. It has to be possible to ask questions covering all PDF files of a local library (.bib file with attached .pdf files).
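One way to keep both phases behind the same feature is a shared interface plus a guard that enforces the "no remote connection" rule. This is a rough sketch under assumptions: `PaperAssistant`, `AssistantConfig`, `AssistantGuard`, and both stub classes are hypothetical names, and the stub logic (first sentence, keyword match) is a placeholder for a real model, not an AI technique.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical names; placeholders only, no real AI service is called.

/** User preference: may JabRef talk to an online service at all? */
record AssistantConfig(boolean remoteAllowed) {}

/** Shared contract for online (Phase 1) and local (Phase 2) backends. */
interface PaperAssistant {
    boolean isRemote();

    String summarize(String fullText);

    String answer(String question, List<String> documents);
}

/** Local stand-in: first sentence as "summary", keyword match as "answer". */
class LocalStubAssistant implements PaperAssistant {
    @Override
    public boolean isRemote() {
        return false;
    }

    @Override
    public String summarize(String fullText) {
        int end = fullText.indexOf('.');
        return end < 0 ? fullText : fullText.substring(0, end + 1);
    }

    @Override
    public String answer(String question, List<String> documents) {
        String[] words = question.toLowerCase(Locale.ROOT).split("\\W+");
        for (String document : documents) {
            String lower = document.toLowerCase(Locale.ROOT);
            for (String word : words) {
                // Skip short words such as "the" or "is".
                if (word.length() > 3 && lower.contains(word)) {
                    return document;
                }
            }
        }
        return "No matching passage found.";
    }
}

/** Marker for an online backend; it makes no network calls in this sketch. */
class RemoteStubAssistant implements PaperAssistant {
    @Override
    public boolean isRemote() {
        return true;
    }

    @Override
    public String summarize(String fullText) {
        throw new UnsupportedOperationException("not part of this sketch");
    }

    @Override
    public String answer(String question, List<String> documents) {
        throw new UnsupportedOperationException("not part of this sketch");
    }
}

/** Rejects remote backends when the user has chosen local-only mode. */
class AssistantGuard {
    static PaperAssistant requireAllowed(PaperAssistant assistant, AssistantConfig config) {
        if (assistant.isRemote() && !config.remoteAllowed()) {
            throw new IllegalStateException("Remote AI services are disabled");
        }
        return assistant;
    }
}
```

With this split, the user interface for summaries and questions only depends on `PaperAssistant`, so a Phase 2 backend could replace a Phase 1 backend without UI changes.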

Possible Mentors:

@koppor, @Siedlerchr

Project size:

  • Phase 1 only: 175h (medium)
  • Phases 1 and 2: 350h (large)

Welcome Walkthrough

This project aims to create an engaging and informative first start screen for JabRef, enhancing the initial user experience and showcasing the best features of the software. This screen will differ from the standard interface displayed when no database is open, providing a tailored introduction for new users.

Expected Outcome:

  1. The welcome dialog should ask for: Configuration of Paper Direction, Integration of Online Services (Grobid, Telemetry), Creation of Example Library, Community Engagement Tool, Link to Donation page
  2. The welcome dialog should offer sensible user-group-specific defaults: pre-configured default preferences catering to different user groups, such as "relaxed users" wanting all features, and "pro-users" who prefer managing BibTeX files without additional features (as per Issue #9491).

(These are just ideas; they need to be refined during the project.)
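The user-group-specific defaults could be modeled as a mapping from a profile to feature switches. This is a minimal sketch, assuming a hypothetical `UserProfile` enum and a hypothetical `Feature` enum whose entries are taken from the dialog items listed above; real JabRef preferences are structured differently.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical types for illustration; not JabRef's preference classes.

/** Optional features the welcome dialog could switch on or off. */
enum Feature { GROBID, TELEMETRY, EXAMPLE_LIBRARY }

/** The two user groups named in the idea description. */
enum UserProfile { RELAXED, PRO }

class ProfileDefaults {
    /** Relaxed users get every feature; pro users start with all of them off. */
    static Map<Feature, Boolean> defaultsFor(UserProfile profile) {
        Map<Feature, Boolean> defaults = new EnumMap<>(Feature.class);
        for (Feature feature : Feature.values()) {
            defaults.put(feature, profile == UserProfile.RELAXED);
        }
        return defaults;
    }

    public static void main(String[] args) {
        System.out.println(defaultsFor(UserProfile.RELAXED));
        System.out.println(defaultsFor(UserProfile.PRO));
    }
}
```

The dialog would apply the chosen map once on first start, and users could still change each switch afterwards in the preferences.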

Skills required:

  • Java, JavaFX

Possible Mentors:

@koppor, @tobiasdiez

Project size:

  • 175h (medium)

Improved SLR Support

Description:

With the ever-growing number of publications in computer science and other fields of research, conducting secondary studies becomes necessary to summarize the current state of the art. For software engineering research, Kitchenham popularized the systematic literature review (SLR) method to address this issue. The main idea is to systematically identify and analyze the majority of relevant publications on a specific topic. This is usually an activity that takes extensive manual effort. Some tool support does exist, but the full potential of tools has not been exploited yet. JabRef also offers basic functionality for systematic literature reviews that is used by a number of researchers to systematically "harvest" related work based on the fetching capabilities of JabRef. While using the feature, various additional feature requests came up. For instance, created search queries are currently transformed internally by JabRef to the query format of the publisher. It should also be possible to directly input a query at the publisher site, e.g., for IEEE or ACM. More information: Dominik Voigt, Oliver Kopp, Karoline Wild: Systematic Literature Tools: Are we there yet? ZEUS 2021: 83-88

One key aspect would be the improvement of the fetcher infrastructure in JabRef to better adapt to new and changing publisher and journal websites and to offer a more direct integration. As an inspiration, see BibDesk.
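The two query paths described above (internal transformation versus direct publisher input) can be illustrated with a small model. This is a sketch under assumptions: all type names are hypothetical, and the `field:"value"` output syntax is invented for illustration and does not correspond to the real IEEE or ACM query formats.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model; the output syntax is made up for this example.

/** One field/value pair of a structured search. */
record Term(String field, String value) {}

/** Either a structured query or a raw query typed in publisher syntax. */
interface SearchQuery {}

record StructuredQuery(List<Term> terms) implements SearchQuery {}

record RawQuery(String publisherSyntax) implements SearchQuery {}

/** Turns a query into the string sent to one publisher. */
interface QueryTransformer {
    String toPublisherQuery(SearchQuery query);
}

class ExampleTransformer implements QueryTransformer {
    @Override
    public String toPublisherQuery(SearchQuery query) {
        // A raw query is passed through untouched, as the user typed it.
        if (query instanceof RawQuery raw) {
            return raw.publisherSyntax();
        }
        // A structured query is rendered in the (invented) target syntax.
        StructuredQuery structured = (StructuredQuery) query;
        return structured.terms().stream()
                .map(term -> term.field() + ":\"" + term.value() + "\"")
                .collect(Collectors.joining(" AND "));
    }
}

class QuerySketch {
    public static void main(String[] args) {
        QueryTransformer transformer = new ExampleTransformer();
        SearchQuery structured = new StructuredQuery(
                List.of(new Term("title", "ocr"), new Term("author", "kopp")));
        System.out.println(transformer.toPublisherQuery(structured));
        System.out.println(transformer.toPublisherQuery(new RawQuery("(native publisher query)")));
    }
}
```

Modeling the raw query as its own type keeps the passthrough explicit, so a fetcher can tell whether a query is portable across publishers or tied to a single one.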

Expected outcome:

An advanced SLR functionality that supports a researcher in conducting a systematic literature review.

Skills required:

  • Java, JavaFX

Possible mentors:

@koppor, @Siedlerchr, @calixtus

Project size:

350h (large)

Improved CSL Support

Description:

JabRef can connect to LibreOffice to offer premier reference management for LibreOffice. Currently, custom styles are supported. In this project, this support should be extended to "Citation Style Language" (CSL) files. A user should be able to choose the CSL style for the reference list and the citation style. Then, the LibreOffice document should adapt accordingly. To improve the user experience further, i) a CSL-based export of library entries and ii) a CSL editor should be integrated into JabRef. The editor would allow the creation and modification of styles. For more information on CSL, refer to https://citationstyles.org/.

Expected outcome:

It is possible to select and change a CSL style for a LibreOffice document.

Possible Mentors:

@koppor, @Siedlerchr, @calixtus

Project size:

90h (small)