Bruno Oberle - Projects

Coreference Annotation Tool (SACR)

A tool for annotating mentions and coreference relations using a simple drag-and-drop interface.

Optimized for ease of annotation, it provides a visual representation of relations, numerous shortcuts, and search options.

Supports feature annotation for each mention, such as part of speech, grammatical gender, and number. The annotation set is fully customizable.

Allows conversion of annotations to and from various formats, including CoNLL and TXM.

A single-page application developed in Vanilla JavaScript.

Automatic Coreference Resolution for Spoken and Written French With AI (`cofr`)

A coreference resolution model trained on the Ancor and Democrat corpora.

The model detects mentions, including singletons, as well as coreference relations.

The model was obtained by fine-tuning BERT, one of the first large language models (LLMs), using TensorFlow.

This work was published at LREC 2020.

Lang Track App (Next Generation): A survey app

A project designed to enable a research team to send small but frequent surveys to participants via a mobile app and a website.

It includes a mobile app for Android and iOS, a back-office system, and a backend.

Reminders are sent via push notifications, emails, and/or SMS.

Developed using Python, React, and React Native.

Currently in the testing phase.

Conversion scripts for various coreference annotation formats (`corefconversion`)

A collection of scripts and tools for converting between various formats used in coreference annotation.

Supported formats include CoNLL, Brat, JSONLines, Glozz, SACR, plain text, and more.

Most scripts are written in Python, with some in Perl.

Some of the scripts are available as a PyPI package (installable with pip).
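To give an idea of what converting between these formats involves, here is a minimal sketch that reads the coreference column of a CoNLL file and turns it into mention spans. It assumes the CoNLL-2012 bracket notation (`(12` opens a mention of chain 12, `12)` closes it, `|` separates several marks on one token). The function name `coref_column_to_spans` is hypothetical and is not taken from `corefconversion`.

```python
import re
from collections import defaultdict


def coref_column_to_spans(column):
    """Return (chain_id, start, end) tuples from a CoNLL-2012 style coreference column.

    `column` holds one cell per token, e.g. "(0", "0)", "(1)", "-".
    Token indices are 0-based and `end` is inclusive.
    """
    open_starts = defaultdict(list)  # chain id -> stack of start indices
    spans = []
    for index, cell in enumerate(column):
        if cell in ("-", "_"):
            continue  # token is outside any mention boundary
        for part in cell.split("|"):
            match = re.fullmatch(r"(\()?(\d+)(\))?", part)
            if match is None:
                raise ValueError(f"unexpected coreference mark: {part!r}")
            opens, chain, closes = match.groups()
            chain = int(chain)
            if opens:
                open_starts[chain].append(index)
            if closes:
                # Closing mark: pair it with the most recent opening of the same chain
                start = open_starts[chain].pop()
                spans.append((chain, start, index))
    return sorted(spans, key=lambda span: (span[1], span[2]))
```

For instance, the column `["(0", "0)", "-", "(1)", "(0)|(2", "2)"]` yields `[(0, 0, 1), (1, 3, 3), (0, 4, 4), (2, 4, 5)]`. Once mentions are available as plain spans, writing them out in another format is mostly a matter of serialization.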

Ancient Greek linguistics and grammar reference sheets

A set of 150 reference sheets (approximately 420 pages) providing an analytical description of the morphology, tenses, moods, phonetics, and syntax of Ancient Greek.

They explain the different cases in great detail, with the help of diagrams.

I wrote them while studying Ancient Greek at the University of Strasbourg.

My goal was to be highly analytical in order to truly understand the underlying principles. Rather than simply memorizing declensions, these sheets help you grasp why the endings are what they are.

Vinted Downloader (Firefox Extension)

A simple Firefox add-on for downloading data and full-size photos from the Vinted app.

Allows downloading product data in JSON and text formats, as well as full-size photos.

Also supports downloading conversation data, including content, full-size images, and shipping (tracking) information.

Vinted Downloader (Python package)

A Python script for downloading data and full-size photos from Vinted.

Downloads product data in JSON and text formats, along with full-size photos.

Supports downloading conversations, including content, full-size images, and shipping (tracking) data.

Available on PyPI (installable with pip) and invoked with the `vinted-downloader` command.

Coreference databases and corpora for English and French (`corefdb`)

A database containing coreference data (mentions, chains, and relations) and textual structures (tokens, sentences, paragraphs, and texts), enriched with linguistic annotations (e.g., part of speech and named entities).

This is an enhanced version of the Democrat corpus for French.

Python scripts are available to create custom databases from other corpora (e.g., CoNLL or user-provided annotations).

Coreference Analysis Tool (CRViewer)

A tool for computing coreference statistics and visualizing them using pie charts and bar plots.

The tool is written in Java and processes input files annotated with SACR.

Other annotation formats (e.g., CoNLL) are supported through conversion.

Random item generator

A website for quickly generating random items such as names, emails, cities, HTML text, pictures, locations with maps, user profiles, and more.

You can generate simple lists or more complex structures with loops and groups.

All data is sourced from Wikipedia and has been randomized.

Tree Visualization of a Dependency Parser (`dependency2tree`)

This tool converts the CoNLL output from dependency parsers, such as StanfordNLP (for English) or Talismane (for French), into LaTeX or Graphviz tree representations.
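The idea can be sketched in a few lines of Python. This is only an illustration, not the code of `dependency2tree`: it assumes tab-separated CoNLL rows with the token ID in column 1, the word form in column 2, the head ID in column 7, and the dependency label in column 8 (the layout shared by CoNLL-X and CoNLL-U), and the function name `conll_to_dot` is hypothetical.

```python
def conll_to_dot(sentence: str) -> str:
    """Turn one CoNLL sentence into a Graphviz DOT graph (head -> dependent edges)."""
    nodes = ['  n0 [label="ROOT"];']  # head ID 0 conventionally denotes the root
    edges = []
    for row in sentence.strip().splitlines():
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = row.split("\t")
        token_id, form, head, deprel = cols[0], cols[1], cols[6], cols[7]
        if not token_id.isdigit() or not head.isdigit():
            continue  # skip multiword ranges (1-2), empty nodes (1.1), missing heads
        # Escape backslashes and double quotes so the label stays valid DOT
        label = form.replace("\\", "\\\\").replace('"', '\\"')
        nodes.append(f'  n{token_id} [label="{label}"];')
        edges.append(f'  n{head} -> n{token_id} [label="{deprel}"];')
    return "\n".join(["digraph sentence {"] + nodes + edges + ["}"])
```

The returned string can be saved to a `.dot` file and rendered with Graphviz, for example with `dot -Tpng sentence.dot -o sentence.png`.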

Rephraise: Using AI to reformulate a text

A tool for rephrasing text, with options to adjust formality and maintain varying levels of similarity to the original.

The best part is that you can instantly see the differences, similar to a GitHub or GitLab diff.

I use it to refine my support emails to customers. 😅

It simply sends data to the OpenAI API.

Available in French and English.

Visual Representation of Coreference Relations

Various ways of representing coreference relations between linguistic expressions in a text.

NASA and USGS raw Elevation Models as images (`hgt2pnm`)

This Python script converts raw elevation models from NASA and USGS into a PNM image.

HGT files contain elevation data provided by USGS and NASA. Each pixel covers either 1 or 3 arc-seconds (roughly 30 or 90 meters on the ground), depending on the region.

The data is stored as 16-bit signed integers in Motorola (big-endian) byte order.

This tool converts the data to little-endian byte order, as used on most Linux machines, and outputs an image.
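As a rough illustration of this kind of conversion, the sketch below decodes big-endian 16-bit samples and writes them as an 8-bit grayscale PGM (one of the PNM formats). It is not the code of `hgt2pnm`: the function name `hgt_to_pgm`, the rescaling to 0..255, and the handling of -32768 as a "no data" value are assumptions made for the example.

```python
import struct


def hgt_to_pgm(raw: bytes, side: int) -> bytes:
    """Convert raw HGT bytes (side x side big-endian int16 samples) to a binary PGM."""
    count = side * side
    if len(raw) != count * 2:
        raise ValueError("data size does not match the given tile side")
    # ">" selects big-endian (Motorola) order, "h" a signed 16-bit integer
    heights = struct.unpack(f">{count}h", raw)
    # Assumption: -32768 marks missing samples; ignore them when finding the range
    valid = [h for h in heights if h != -32768]
    low = min(valid) if valid else 0
    high = max(valid) if valid else 0
    span = max(high - low, 1)  # avoid division by zero on flat tiles
    # Rescale every sample to 0..255; missing samples are clamped to the lowest value
    pixels = bytes((max(h, low) - low) * 255 // span for h in heights)
    header = f"P5\n{side} {side}\n255\n".encode("ascii")
    return header + pixels
```

A tile would be read with `open(path, "rb").read()` and the side length derived from the file size (number of bytes divided by 2, then the square root). Rescaling to 8 bits loses precision but gives an image that any PNM viewer can display directly.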

Draw a Frame on your Screen (`drawframe`)

A program that draws a frame that stays above all other windows, useful for recording screencasts.

This tool is written in C and designed for Linux systems using X11.

Visual Timeline Creator (`mktimeline`)

A tool for creating visual timelines from a list of dates and event names.

This lightweight tool is written in Perl 5.

Ancient Greek Font

An Ancient Greek font, created with FontForge, that resembles the one used by a famous French publisher.

Laboratory DC Power Generator

How I built a laboratory DC power generator (2 × 1.2 V to 20 V, 333 mA).
