Bachelor's and Master's Projects

We always have new topics for projects in the areas of our current research. Contact us.


Improving String Function Semantics in QLever’s SPARQL Engine

Annika Greif, SS 2025, Project Homepage

In this project, we improved the handling of language tags and datatypes in string expressions, such as SUBSTR, UCASE, STRAFTER and CONCAT, in the SPARQL engine QLever, making it more compliant with the SPARQL specification. Prior to this project, QLever discarded all language tags and datatypes when processing literals in string expressions.
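As a rough illustration of the semantics involved, the following Python sketch models how the SPARQL 1.1 specification propagates language tags through UCASE, STRAFTER and CONCAT. The `Literal` class and the function names are hypothetical and are not taken from QLever's code; datatypes and the argument-compatibility rules of the specification are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal with an optional language tag."""
    value: str
    lang: Optional[str] = None


def ucase(lit: Literal) -> Literal:
    # UCASE keeps the language tag of its argument.
    return Literal(lit.value.upper(), lit.lang)


def strafter(lit: Literal, needle: str) -> Literal:
    # An empty needle matches at the start, so the whole literal is returned.
    if needle == "":
        return lit
    _, sep, tail = lit.value.partition(needle)
    if not sep:
        # No match: the result is the empty simple literal (no language tag).
        return Literal("")
    return Literal(tail, lit.lang)


def concat(*lits: Literal) -> Literal:
    # The tag survives only if every argument carries the same one.
    langs = {lit.lang for lit in lits}
    lang = langs.pop() if len(langs) == 1 else None
    return Literal("".join(lit.value for lit in lits), lang)


print(ucase(Literal("abc", "en")))
print(strafter(Literal("abc", "en"), "b"))
print(concat(Literal("foo", "en"), Literal("bar")))
```

Discarding all tags, as described above for the earlier behavior, would correspond to always returning `Literal(value)` without the `lang` argument.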

An efficient external R-tree for very large datasets

Noah Nock, WS 2024, Project Homepage

Information retrieval on large datasets is fundamental to our modern world. To find the data points that lie within a query region of a geospatial dataset, one can use numerous algorithms and data structures. Probably the best pick for this application is the R-tree. While the basic concept of the R-tree is widely known, there is a lack of efficient algorithms that can build an R-tree on very large datasets without exceeding the capacity of the working memory. In this project I designed and implemented an R-tree that works on datasets of any size while maintaining fast query times and memory-efficient construction.

Extension for QLever: Implementing missing SPARQL Expressions

Hannes Baumann, WS 2024, Project Homepage

QLever is a query engine that follows the SPARQL 1.1 Query Language standard. It provides an efficient and highly optimized backend for querying RDF (Resource Description Framework) databases based on descriptive triples. Some expressions defined by SPARQL 1.1 were still missing; the objective of my project was to implement them.

A macro-benchmarking library for the QLever SPARQL engine

Andre Schlegel, WS 2024, Project Homepage

Comparing the execution times of different algorithm implementations can be time-consuming and labor-intensive. The effort is needed to cover every relevant general case; otherwise, no reliable statements can be made about the execution times. To make this coverage easier, I’ve built an internal macro-benchmarking library for the QLever SPARQL engine. It offers a simple method for measuring execution time in seconds, multiple ways to organize the measurements, metadata support, and support for user-defined runtime configuration options, which should simplify generating the measurements.

A Grid Operator's Dream: Error Detection with GNNs

Carl Wanninger, WS 2024, Project Homepage

Topology maps of distribution grids are not always 100 % accurate. Nevertheless, precise topology knowledge is required for efficient grid operation and grid expansion. At Fraunhofer ISE an approach based on Graph Neural Networks was developed in order to verify given grid topologies. This Master’s Project evaluates the algorithm in more realistic scenarios and analyzes possible improvements.

Enhanced Visualization of Geospatial Data

Tobias Bürger, SS 2024, Project Homepage

Petrimaps is a powerful tool for visualizing geospatial data on a map. So far it has exclusively used the results of QLever, a tool for processing SPARQL queries. In this project we will see how we can add support for reading geospatial data from a GeoJSON file, display a progress bar, and create a UI that reuses parts of the existing QLever UI for consistency.

Deep Knowledge Graph Question Answering

Elias Kempf, SS 2024, Project Homepage

Question answering systems automatically provide answers to questions posed in natural language. A more specific type of question answering is knowledge graph question answering (KGQA). Such systems rely on translating a given question into a query over a knowledge graph. These systems often generate many possible queries at once and rank them according to some heuristic. There are also LLM-based systems that try to generate the desired queries directly. In this project, we want to fine-tune and evaluate pre-trained LLMs for SPARQL query generation using the Wikidata SimpleQuestions dataset.

Cracking the Black Box of Graph Neural Networks for Electrical Grid Validation

Cora Hartmann, SS 2024, Project Homepage

For a successful energy transition, we need accurate models of the electric grid. The Fraunhofer Institute for Solar Energy Systems ISE developed a Graph Neural Network (GNN) to identify and correct errors in grid models. However, due to the complex model architecture, the behavior of the GNN is hard to interpret. Explainable AI can help network operators understand rather than blindly trust the GNN output. In this blog post, several explainer algorithms are compared that make it possible to understand how the GNN reaches its conclusions.

Dynamic Observation and Interruption of SPARQL Queries

Robin Textor-Falconi, WS 2023/24, Project Homepage

A dive into the process of designing an architecture that allows observers to interact with complex queries in real-time.

Optimizing GROUP BY in QLever using Hash Maps

Fabian Krause, WS 2023/24, Project Homepage

The current algorithm for evaluation of GROUP BY in the SPARQL engine QLever requires its input to be sorted. In this project, we improve the performance of GROUP BY with the aid of hash maps, which allow us to skip sorting the input.
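The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration with made-up function names and COUNT as the aggregate; it does not reflect how QLever implements either variant.

```python
from itertools import groupby


def group_count_sorted(rows, key):
    """Sort-based grouping: sort by the key column, then scan runs of equal keys."""
    ordered = sorted(rows, key=lambda row: row[key])
    return {
        value: sum(1 for _ in run)
        for value, run in groupby(ordered, key=lambda row: row[key])
    }


def group_count_hashed(rows, key):
    """Hash-based grouping: a single pass over unsorted input."""
    counts = {}
    for row in rows:
        counts[row[key]] = counts.get(row[key], 0) + 1
    return counts


rows = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]
print(group_count_sorted(rows, 0))
print(group_count_hashed(rows, 0))
```

Both functions return the same groups; the hashed variant avoids the O(n log n) sort at the cost of keeping one hash map entry per distinct key.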

Bill of Material Exploration Service

Johannes Herrmann, WS 2023/24, Project Homepage

To make business decisions based on data, the accessibility of information is essential. In the current supply chain situation and the resulting shortage of materials, the structure of products has gained importance. Bills of materials (BoMs) govern this structure. At the same time, the complexity of BoMs is constantly increasing due to global production and procurement. With classical relational data models and databases, BoM analysis is often time-consuming and requires expert knowledge. To be able to evaluate these structures efficiently and reliably, the Bill of Material Exploration Service was developed based on graph technology.

Introducing WiNERLi 2.0, an extension of WiNERLi

Johanna Götz, WS 2022/23, Project Homepage

This project is about named-entity linking on Wikipedia, so the goal is to detect and assign possible entities in the text of Wikipedia pages. My code and its functionality are based on the work that another student did for their bachelor’s project and thesis. This previous system was named WiNERLi (short for Wikipedia Named-Entity Recognition Linking). I reimplemented and extended most of the code in an attempt to deliver a higher quality result in a more efficient manner. This new version is called WiNERLi 2.0. In the following, I will describe the functionality of WiNERLi 2.0, explain how it differs from WiNERLi, and present the evaluation results.

Spelling Error Detection using Deep Neural Networks

Stanley George, WS 2022/23, Project Homepage

This project aims to detect spelling errors in a given sentence using a Deep Learning approach.

Semantic SPARQL Templates for Question Answering over Wikidata

Christina Davril, SS 2023, Project Homepage

Translating natural language (NL) questions to formal SPARQL queries that can be run over Wikidata to retrieve the desired information is still a very challenging task. This Bachelor’s project aims to identify “semantic SPARQL templates”, i.e. syntactic SPARQL templates based on elements of the semantic structure of the NL question. When put into practice, these templates should improve the reliable, correct answering of NL questions – particularly of those for which the required SPARQL query has a surprisingly complex syntax, e.g., containing subqueries.

xs grep: A GNU grep-like executable built with x-search

Leon Freist, SS 2023, Project Homepage

xs grep is a GNU grep-like executable. It is built using x-search, a C++ library for fast external string search that I wrote as part of my bachelor's thesis. This project briefly describes how x-search was used to implement xs grep. It also aims to introduce xs grep in a more practical way than my thesis did and to provide insights into its implementation.

Predicting Ownership of Streets in OpenStreetMap Using Machine Learning

Christoph Röhrl, WS 2022/23, Project Homepage

In network infrastructure planning, having to deploy something on private property can have a big impact on cost as well as overall efficiency. This project investigates to what degree publicly available data from OpenStreetMap about streets and their properties can be used to predict their ownership status.

Transform PDF timetables into GTFS

Julius Heinzinger, WS 2022/23, Project Homepage

The current inability to extract schedule data from PDF timetables prevents the easy use of this otherwise available data.
Although many GTFS feeds already exist, either on the transit agencies' websites or in some database, this Python project aims to enable the extraction of this data when such feeds are not available.

Simplified Package Delivery App

Ievgen Markhai, SS 2022, Project Homepage

The project aims to create a web application to optimize parcel delivery in large institutions / companies / campuses without a central concierge service.

Detection of Electric Vehicle Charging Events Using Non-Intrusive Load Monitoring

Rohit Kerekoppa Ramesha, SS 2022, Project Homepage

Non-intrusive load monitoring (NILM) is the process of taking the total energy consumption of a house as a time series, which is the sum of the consumption of the individual appliances, and predicting each individual appliance’s consumption time series from it. NILMTK is an open-source toolkit for comparative analysis of NILM algorithms across various datasets. The goal of this project is to use NILMTK to predict the energy consumed while charging electric vehicles from the overall energy consumption of a house, using a synthetic dataset.

Question Answering on Wikidata

David Otte, SS 2022, Project Homepage

Aqqu is an efficient question answering system that was developed using Freebase. This project aims to implement Aqqu for question answering on Wikidata instead of Freebase and to improve the accuracy of previous implementations using Wikidata. In short, for a given natural language question, SPARQL queries that might answer the question are generated and ranked afterwards in order to get the query that is most likely to answer the given question.

Testing and improving Antpower on the Simbench networks

Metty Kapgen, SS 2022, Project Homepage

The project aims to test and improve Lukas Gebhard’s Antpower on the Simbench networks provided by Fraunhofer ISE. In short, Lukas has developed a solver that is able to produce a cheap expansion plan for a given low-voltage grid. In this project we aim to test his solver on a small, synthetic set of low-voltage grids called the Simbench networks. At its core, Antpower uses the ant colony optimization algorithm to produce cheap expansion plans. These are needed because the current state of the grid can contain several overloaded components, which need to be upgraded with better but more expensive cable types. This project will show that, given a large computation time, Lukas’ Antpower produces good solutions.

Wikipedia Entity Linking and Coreference Resolution in C++

Benjamin Dietrich, SS 2022, Project Homepage

The aim of this project was to implement a C++ version of an existing Python-based entity linking and coreference resolution system for Wikipedia. The goal was to improve the runtime of the system while maintaining the already good linking results. This article discusses how this goal was achieved.

Project Map Matching Mobile Phones to Public Transit Vehicles

Gerrit Freiwald, Robin Wu, SS 2022, Project Homepage

Map matching can be used to match a given sequence of GPS points to a digital model of the real world. ‘Traditional’ map matching, like navigation systems for cars, uses a static map for the matching. In contrast, when working with a public transit vehicle network, the ‘map’ contains the positions of each vehicle, which are highly dynamic.

Project Public Transit Map Matching With GraphHopper

Michael Fleig, WS 2021, Project Homepage

TransitRouter is a web tool for generating shapes of GTFS (General Transit Feed Specification) feeds for bus routes using the map matching approach described in the paper Hidden Markov Map Matching Through Noise and Sparseness.

TransitRouter uses the OSM routing engine GraphHopper and a modified version of the GraphHopper Map Matching library that enables turn restrictions and tries to prevent inter-hop turns.

GraphHopper is a fast and memory-efficient routing engine written in Java. It has built-in support for different weighting strategies (e.g. fastest and shortest path routing), turn restrictions (based on OSM metadata), turn costs and, most importantly, the ability to specify custom routing profiles. As public transit has quite different traffic rules, we created a specialized bus profile.

The quality of the results is compared with the original GraphHopper Map Matching library (GHMM) and pfaedle, a similar tool developed by the Chair of Algorithms and Data Structures at the University of Freiburg.

The source code of TransitRouter is available on GitHub.

Enhancing find functionality of pdf.js

Robin Textor-Falconi, WS 2021, Project Homepage

An adventure into pdf.js’ text extraction, and the ways it can be improved for better find functionality.

Segmentation of layout-based documents

Elias Kempf, WS 2021, Project Homepage

PDF is a widely used file format and in most cases very convenient to use for representing text. However, PDF is layout-based, i.e., text is only saved character by character and not even necessarily in the right order. This makes tasks like keyword search or text extraction pretty difficult. The goal of this project is to detect individual words, text blocks, and the reading order of a PDF document to allow for reconstruction of plain text.

Energy Price Forecasting

Sneha Senthil, WS 2021, Project Homepage

The project aims to predict energy prices for the next 24 hours, given the history of past features such as load, generation, prices and weather data. This data is downloaded from two data sources: ENTSOE and Copernicus. MLPs, residual networks and LSTMs are trained with different hyperparameters, different subsets of features and different histories, and the results are compared. The residual MLP has the best results. This is likely due to the fact that the residual MLP can adapt to changing prices better than the other models.

jSPINE - A Java Library implementing EEBUS SPINE

Martin Eberle, WS 2021, Project Homepage

In a world with rising energy demands, power limitations imposed by grid infrastructures and a fluctuating supply from renewable energy sources, communication between devices that consume electrical power is key to distributing electricity efficiently. The EEBus Initiative e.V. provides a free and open, standardized language that aims to overcome the barriers in communication between devices of different manufacturers.

Spelling Correction and Autocompletion for Mobile Devices

Ziang Lu, SS 2021, Project Homepage

A virtual keyboard is a powerful tool for smartphones, with which users can improve the quality and efficiency of the input. In this project, we will explore how to use n-gram models to develop an Android keyboard which gives accurate corrections and completions efficiently.
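To make the idea of n-gram-based completion concrete, here is a minimal bigram sketch in Python. The toy corpus, the function names and the ranking rule (frequency, then alphabetical order) are assumptions for illustration only and are not taken from the project.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model


def complete(model, prev_word, prefix, k=3):
    """Return up to k words starting with `prefix`, ranked by bigram count."""
    candidates = model.get(prev_word.lower(), Counter())
    matches = [(w, c) for w, c in candidates.items() if w.startswith(prefix.lower())]
    # Most frequent first; ties are broken alphabetically for determinism.
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matches[:k]]


corpus = ["the cat sat on the mat", "the cat ate the mouse", "the car is red"]
model = train_bigrams(corpus)
print(complete(model, "the", "ca"))
```

A real keyboard would additionally need smoothing for unseen word pairs and an error model for spelling correction, both of which this sketch omits.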

Named Entity Disambiguation with BERT

Amund Faller Råheim, SS 2021, Project Homepage

Large transformer networks such as BERT have led to recent advancements in the NLP field. The contextualized token embeddings that BERT produces should serve as good input to entity disambiguation, which benefits from context. This master project aims to use BERT on the task of Named Entity Disambiguation.

Sentence Segmentation

Krisztina Agoston, SS 2021, Project Homepage

Sentence segmentation is a basic part of many natural language processing (NLP) tasks. The leading NLP Python libraries spaCy and NLTK offer pre-trained models for it. These models often fail on a specific domain. The goal of this project is to measure the performance of these libraries and compare their results with a custom-made LSTM model on special domains like Wikipedia or arXiv.

Automated Generation of Rail Noise Maps

Urs Spiegelhalter, SS 2021, Project Homepage

Given a geographical network of railroads and OpenStreetMap data, this project automatically generates rail noise maps. The results can be shown in a web application where hourly time spans, which represent typical rail schedules, can be compared.

Simple Question Answering on Wikidata

Thomas Goette, WS 2020/21, Project Homepage

Aqqu translates a given question into a SPARQL query and executes it on a knowledge base to get the answer to the question. While the original Aqqu uses Freebase and some additional external sources, this new version uses nothing but Wikidata.

Tokenization repair using Transformers

Sebastian Walter, WS 2020/21, Project Homepage

This project tackles the tokenization repair problem using the Transformer neural network architecture. We achieve results that match the performance of previous work on multiple tokenization repair benchmarks paired with usable runtimes in practice.

Gantry

Axel Lehmann, SS 2020, Project Homepage

Gantry allows docker-compose-like deployments with wharfer. wharfer is a replacement for the docker executable used and maintained by the Chair of Algorithms and Data Structures. Additionally, gantry allows executing containers in a sequential order. Within this sequential order, gantry tries to execute as many containers in parallel as possible. This minimizes overall execution time but keeps results deterministic.

ClueWeb Entity Linking

Pablo de Andres, SS 2020, Project Homepage

In this project, Named Entity Recognition and Disambiguation is carried out on the ClueWeb12 dataset.

POLUSA: A large Dataset of Political News Articles

Lukas Gebhard, SS 2020, Project Homepage

I present POLUSA, a dataset of 0.9M online news articles covering policy topics. POLUSA aims to represent the news landscape as perceived by an average US news consumer. In contrast to previous datasets, POLUSA makes it possible to analyze differences in reporting across the political spectrum, an essential step in, e.g., the study of media effects and causes of political partisanship.

Named Entity Recognition and Disambiguation

Yi Chun Lin, SS 2020, Project Homepage

This project uses Wikidata as the target knowledge base and aims to improve the speed and correctness of recognition and disambiguation of named entities. It is evaluated on the CoNLL-2003 benchmark. A configurable framework is designed to observe the effectiveness of each part of the algorithm.

River Maps

Jianlao Shao, WS 2019/20, Project Homepage

In this project, we extract the waterway system from OpenStreetMap data and render it in a way that makes it possible to track the tributaries of a river. We use the tool LOOM to display the relationship between waterways.

Bitcoin Trading App

Johannes Hermann, WS 2019/20, Project Homepage

By anticipating market movements ahead of time it is possible to generate profits. This application implements several trading strategies which aim to do that automatically.

Circular Transit Maps

Jonathan Hauser, SS 2019, Project Homepage

Transit Maps can be found in many places. By replacing the actual road geometry with a simpler geometry like arcs the result not only becomes more aesthetically pleasing but also more readable. Nonetheless, the original road layout shouldn’t be left completely unconsidered to avoid confusion when reading the map.

UniPal (A chatbot for the course catalogue of the Uni Freiburg)

Pascal Muckenhirn and Tanyu Tanev, SS 2019, Project Homepage

UniPal is a chatbot designed to simplify finding information about your courses and tutorials. Instead of crawling the web, like students normally do to get information, you can now just ask UniPal and it'll fetch the information for you - just like any of your best pals. If you're interested in our project, feel free to check out our project homepage - you can contact UniPal directly on it!

Tokenization Repair Using Character-based Neural Language Models

Matthias Hertel, WS 2018/19, Project Homepage

Common errors in written text are split words (“algo rithm”) and run-on-words (“runsin”). I use Character-based Neural Language Models to fix such errors in order to enable further processing of misspelled texts.

WikiQuestions

Natalie Prange, WS 2018, Project Homepage

Systems based on machine learning such as Question Answering or Question Completion systems require large question datasets for training. However, large question datasets that are not restricted to a specific kind of question are hard to find. The WebQuestions (Berant et al., 2013) and Free917 (Cai & Yates, 2013) datasets both contain fewer than 10,000 questions. The SimpleQuestions dataset (Bordes et al., 2015) contains 108,442 questions, but the questions are limited to simple questions over Freebase triples of the form (subject, relationship, object). The 30M Factoid Question-Answer corpus (Serban et al., 2016) contains 30 million questions; however, these questions, too, are limited to simple Freebase triple questions.

We introduce a question dataset containing 4,390,597 questions and corresponding answer entities that are generated by rephrasing Wikipedia sentences as questions. The rough pipeline of the question generation (QG) system is as follows: A Wikipedia dump with Freebase entity mentions is preprocessed by annotating entities with their types. The preprocessed Wikipedia dump is then parsed using a dependency parser. For each sentence, entities that fulfill certain grammatical criteria are selected as answer entities. A fitting WH-word is selected for an answer entity and various transformations are performed over the sentence to rephrase it as a question. Finally, the generated questions are filtered to avoid ungrammatical or otherwise unreasonable questions. The following section describes the system pipeline in more detail.

Complete Search UI

Olivier Puraye, WS 2018, Project Homepage

Make any table-structured data file searchable using the different features of CompleteSearch
Automatic detection of separators
Validate entire file and output syntax error with line count
Analyse input file and determine suitable parameters for each of its columns
Build a nice and easy-to-use web app
Make all search engine settings adjustable via the web app

Concept Neurons

João Carvalho, WS 2018, Project Homepage

This webpage showcases the master project developed by João Carvalho at the Chair of Algorithms and Data Structures of the University of Freiburg, as part of the MSc degree in Computer Science.

In this project we explored the capabilities of neural language models. More precisely, we questioned if a neural network would be able to encode Part-of-Speech (POS) tags in its neurons, just by training a simple language model.

We first trained a byte-level language model with a Long Short-Term Memory (LSTM) network using a large collection of text. Then, taking a sentence, which can be viewed as a byte sequence, we used its inner representations (the cell states of the LSTM), along with its corresponding POS tags, as the inputs and targets to train a logistic regression classifier. Looking at the classifier weights, we observed that some concepts (POS tags) are encoded in one neuron, i.e., the POS tag of a byte can be derived from one neuron's activation value, while others are derived with more than one neuron together with the logistic regression classifier. For some tags, using three neurons yielded satisfactory results.

The idea for this project originated from the OpenAI paper (Radford et al., 2017). In that paper, the authors found a dimension in the cell states (a neuron) that strongly correlates with the semantic concept of sentiment, which they called the Sentiment Neuron. In this project we also replicated their results.

Football Data Extraction for Broccoli

Jonas Bischofberger, WS 2018, Project Homepage

The Broccoli search engine answers queries about a broad range of entities, but lacks information in more specific domains. The task was to choose an appropriate one of these domains, obtain relational and full-text data from that domain and integrate it into the current Broccoli version. For this project, data about association football players (e.g. height, birth date, current team) and teams (e.g. date of foundation) was chosen.

Search Engine for OSM Data

Iradj Solouk, WS 2018, Project Homepage

The motivation behind this project is to build a search engine for OSM data that delivers results comparable to those of Nominatim. The aim of this project is to set up a basic web application (parser, index structures, basic ranking and the UI) so that the application can be improved in future work, which will mainly focus on the ranking functionality. In the following, the structure of the application, certain parts of it and their development will be presented. The order of presentation also reflects the order of development of the application and the logical execution sequence.

Tabular Information Extraction

Tobias Matysiak, SS 2018, Project Homepage

This project aims to simplify the creation of SPARQL queries for the knowledge base Freebase. Instead of finding out the relevant Freebase types and relations by hand, the user specifies table columns in a simple table description format.

ShapeExtraction

Mohamed Abou-Hussein, Omar Shehata, WS 2017/18, Project Homepage

The General Transit Feed Specification (GTFS) is a group of files that defines a common format for public transportation schedules and other geographic information. GTFS allows public agencies to publish their data in a feed. The goal of the project was to build a tool (GTFS mapper) that, given a GTFS feed, generates an alternative feed with the same format and describing the same region, but using open-source data. The reason is that the data uploaded by public agencies is not always sufficient.

OpenStreetMap (OSM) is an open-source project that provides maps of the world. It is built by volunteers from all over the world. GTFS mapper is a tool that, given a GTFS feed for a certain city and the OSM data representing the same city, outputs a new feed in the same GTFS format, but produced using information and coordinates from the OSM data.

Search with Regular Expressions

Christian Reitter, WS 2017/18, Project Homepage

This project evaluates the design of a "search as you type" interface for special regular expressions on a large set of scanning data.

The extensive and complex database of Perl-compatible regular expressions of the Nmap project was chosen as a practical application for this search. The database consists of about 11,300 regular expressions that represent the bulk knowledge about varying binary response characteristics of a large number of network protocols and software implementations, allowing practical fingerprinting at the level of software, vendors and devices. An additional metadata syntax provided by the Nmap probe format, augmented with standardized Common Platform Enumeration (CPE) identifiers, allows for the categorization of additional information, including the ability to extract target-specific information with the help of capturing groups within the regular expressions.

QLever SPARQL + TEXT UI

Julian Bürklin, Daniel Kemen, SS 2017, Project Homepage

QLever is a full-featured SPARQL+Text engine which returns result tuples for given SPARQL queries. The old UI is very simplistic and hard to use. The goal of our project is to create a simple, powerful and intuitive UI for QLever which supports suggestions and auto-completions.

CompleteSearch

Evgeny Anatskiy, SS 2017, Project Homepage

The main goal was to create an easy-to-use web application, which would take a dataset (CSV, TSV), automatically determine a separator, validate the data (remove rows with a wrong syntax), define search facets, and save the file (pre-processing). Then, use the saved file as an input for the search engine CompleteSearch, which generates indices (post-processing).
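The separator detection and row validation steps could look roughly like the following Python sketch. The heuristic (pick the candidate whose per-line count matches the header on the most lines) and all names are assumptions for illustration, not the project's actual code; quoted fields containing the separator are not handled.

```python
def detect_separator(lines, candidates=(",", "\t", ";", "|")):
    """Guess the separator by comparing each line's count with the header's."""
    header = lines[0]
    best, best_score = None, (0, 0)
    for sep in candidates:
        expected = header.count(sep)
        if expected == 0:
            continue
        matching = sum(1 for line in lines if line.count(sep) == expected)
        # Prefer separators consistent on more lines, then those splitting more columns.
        score = (matching, expected)
        if score > best_score:
            best, best_score = sep, score
    return best


def validate_rows(lines, sep):
    """Keep only rows that have as many columns as the header row."""
    expected = len(lines[0].split(sep))
    return [line for line in lines if len(line.split(sep)) == expected]


csv_lines = ["name,city,year", "Ada,Freiburg,2017", "broken,row"]
sep = detect_separator(csv_lines)
print(repr(sep), validate_rows(csv_lines, sep))
```

For production use, Python's standard `csv` module, which handles quoting, would be a more robust basis than plain `str.split`.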

CompleteSearch does all the work on performing the search in the uploaded dataset. The web application (this project) serves as a middle layer, which processes and corrects the user input, and sends it to a separate local CompleteSearch server.

Pseudo Database for Existing XML Files

Nishat Fariha, SS 2017, Project Homepage

The simulation tool SmartCalc.CTM (developed in-house at Fraunhofer ISE) uses a set of XML files as its data source. These files contain data about the material properties of the components of a photovoltaic module. XML files were chosen over an SQL database because they are human-readable and because files with newly measured material properties can be handed to customers, who also run the software.

Completesearch

Julian Löffler, Rezart Quelibari, Matthias Urban, SS 2016, Project Homepage

The goal of this project is to set up a web app, where one can upload any CSV dataset, and then have a convenient search (with meaningful default settings), without having to set up anything oneself.

Deepdive

Louis Retter, Frank Gelhausen, SS 2016, Project Homepage

This project consisted of getting to know Deepdive as well as finding out how well it performs and identifying possible use cases. Deepdive is a data management system that can extract entities from a given text and predict the probability that entities are engaged in a given relation using machine learning. These predictions are based on user-defined features, not algorithms, which is the main advantage of using Deepdive over other systems.

Efficient Code for (De)Compression

Zhiwei Zhang, WS 2015/2016, Project Homepage

In this project I implemented the Elias-Gamma, Elias-Delta, Golomb and Variable-Byte algorithms and, primarily, an optimized decompression function for the Simple8b algorithm. I also compared them to find the most appropriate algorithm(s) for the search engine Broccoli.
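To give an impression of the simplest of these codes, here is a small Python sketch of Elias-Gamma coding. It works on strings of '0' and '1' characters for readability, whereas an efficient implementation such as the one in this project would operate on packed machine words; the function names are made up for illustration.

```python
def elias_gamma_encode(n):
    """Encode a positive integer: (bit length - 1) zeros, then n in binary."""
    if n < 1:
        raise ValueError("Elias-Gamma is defined for integers >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def elias_gamma_decode(bits):
    """Decode a concatenation of well-formed Elias-Gamma codewords."""
    numbers, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # the unary prefix gives the payload length
            zeros += 1
            i += 1
        numbers.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return numbers


values = [1, 2, 5, 9]
stream = "".join(elias_gamma_encode(v) for v in values)
print(stream, elias_gamma_decode(stream))
```

Small numbers get short codewords (1 is a single bit), which is why such codes suit the gap-encoded posting lists of a search engine index.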

Lexical Semantics

Max Lotstein, SS 2015, Project Homepage

This project models the meanings of words based on how those words are used in a large corpus. The meaning representation supports both equality and distance comparisons, as well as other operations from linear algebra. Some of these operations can be used for semantic comparisons and can thus be used for detection of related, and perhaps even synonymous, word pairs. Various tests of correspondence between human judgments of semantic similarity and the project’s output place it among similar systems, though not at the top.

A Mobile App for Kitchen Account Management

Simon Schonhart, Christian Reichenbach, WS 2014/2015, Project Homepage

The goal of this project was to implement an Android app that can be used to manage the kitchen accounts of our staff. The app keeps track of the products a user has consumed and debits the employee's account with the corresponding price. Moreover, the app can be used to send payment reminders to the users at regular intervals.

Automatic Recognition of Values in Wikipedia Articles

Regina König, WS 2014/2015, Project Homepage

The goal was to automatically find values in Wikipedia articles and convert them into metric units for the semantic search engine Broccoli. The value-finding component runs as one step in the chain of a UIMA pipeline.

OSM Search

Tobias Faas, SS 2014, Projekt Homepage

The system uses the data of the OpenStreetMap project in the form of an osm.pbf file ("Protocolbuffer Binary Format"). This file contains the OSM entities in binary form, which allows faster access. In addition, the boundaries of German cities and districts are read from a text file.
The osm.pbf file is read with the help of the Osmosis library. Entities with tags (for example "shop=bakery") that are defined in the previously created ontology file are extracted and stored in an index.
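The tag-based filtering step can be sketched as follows. The sketch works on plain Python dictionaries instead of reading an osm.pbf file with Osmosis, and the entity layout and function name are assumptions made for this example.

```python
from collections import defaultdict


def build_tag_index(entities, ontology_tags):
    """Map each tag from the ontology to the IDs of the entities carrying it."""
    index = defaultdict(list)
    for entity in entities:
        for key, value in entity["tags"].items():
            tag = f"{key}={value}"
            if tag in ontology_tags:  # keep only tags defined in the ontology
                index[tag].append(entity["id"])
    return dict(index)
```

Entities without any tag from the ontology never enter the index, which keeps it small compared to the full OSM data.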

Entity-Component System

Jochen Kempfle, Markus Reher, WS 2013/2014, Projekt Homepage

The provided Entity-Component-System was implemented as the so-called ESE project at the Chair of Algorithms and Data Structures at the University of Freiburg.

Wikification

Ragavan Natarajan, SS 2013, Projekt Homepage

Wikification is the process of identifying the important phrases in a document and linking each of them to the appropriate Wikipedia article based on the context in which it occurs. The important phrases in a document are also called keyphrases. The term is similar to keyword, but unlike a keyword, a keyphrase can consist of one or more words. A wikifier is a piece of software that performs the wikification of a document. One such wikifier has been developed and made available here. This page discusses how it was developed from the ground up. A more detailed report is available here.

Relation Extraction

Anton Stepan, Marius Bethge, 2012/2013, Projekt Homepage

This project dealt with improving the location-based data found in the YAGO knowledge base, which is used by the semantic full-text search engine Broccoli.

To solve this task, we used the data provided by the GeoNames geographical database and wrote the program GeoReader, which extracts the relevant information and creates valid relation files in the format used by Broccoli.

Manual Feature Engineering with 3D Motion Capture Data

Benjamin Meier, Maria Hügle, WS 2012/2013, Projekt Homepage

This project is about manual feature engineering with 3D motion data recorded by the Xsens MVN 3D kinematics measurement system.

Spider Data Projector Control

Rainer Querfurth, Josua Scherzinger, WS 2012/2013, Projekt Homepage

The Spider VPC project is an existing software package for controlling the projectors at the Faculty of Engineering of the University of Freiburg. It consists of a centrally usable monitoring tool, an XML creator for generating the required XMLSettings files, and the actual Spider VPC program, which is installed on the control units themselves.

The Spider VPC program uses Microsoft's .NET Micro Framework, a partial port of the .NET libraries to the world of microprocessors. For this reason, Microsoft Visual Studio, among other tools, was used for development. The installed control units are GHI Spider kits.

We continued this project as the second group, concentrating on extending the monitoring tool, extending the control options of the Spider VPC, and maintaining all installed Spider units. On this page we briefly present our part of the project.

Multicriteria Multimodal Routeplanning

Robin Tibor Schirrmeister, Simon Skilevic, WS 2012, Projekt Homepage

The goal was to write a program that can perform shortest path queries in a hybrid network of public transportation lines and roads. These queries should use both the travel time of the path and the number of transfers as criteria, hence "multi-criteria".
The program should also be able to create some basic visualization of the Dijkstra computation.
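A multi-criteria variant of Dijkstra's algorithm keeps, for every node, all (time, transfers) pairs that are not dominated by another pair. The following is a minimal sketch of that idea. The graph layout, the edge flag marking a transfer, and the function name are assumptions made for this example, not the project's data model.

```python
import heapq


def pareto_shortest_paths(graph, source, target):
    """Return all Pareto-optimal (time, transfers) labels for reaching target.

    graph maps every node to a list of (neighbor, travel_time, is_transfer).
    """
    labels = {node: [] for node in graph}  # settled labels per node
    queue = [(0, 0, source)]
    while queue:
        time, transfers, node = heapq.heappop(queue)
        # Skip a label if a settled label at this node is at least as good
        # in both criteria.
        if any(t <= time and c <= transfers for t, c in labels[node]):
            continue
        labels[node].append((time, transfers))
        for neighbor, travel_time, is_transfer in graph[node]:
            heapq.heappush(
                queue,
                (time + travel_time, transfers + int(is_transfer), neighbor),
            )
    return labels[target]
```

Instead of a single best path, a query returns a set of trade-offs, for example a fast connection with one transfer and a slower direct one.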

Transfer-pattern robustness

Eugen Sawin, Philip Stahl, Jonas Sternisko, SS 2012, Projekt Homepage

This project covers the development of an efficient route planner for public transportation networks, which is used to conduct experiments on the reliability of transfer patterns on dynamically modified networks.

Entitätserkennung für semantische Volltextsuche in medizinischen Fachartikeln

Jan Kelch, Masterprojekt, WS 2011/2012, Projekt Homepage

The basis of this project is a collection of medical journal articles (ZBmed). ZBmed comprises more than 1,000,000 articles from various journals. The goal of the project was to prepare the texts of these articles for the semantic search engine Broccoli. To this end, certain entities that Broccoli should take into account have to be marked in the texts.

Transit Routing

Mirko Brodesser, Dirk Kienle, Thomas Liebetraut, Kyanoush Seyed Yahosseini, SS 2011, Projekt Homepage

While it is quite easy to find paths in street networks, incorporating transit data can be quite challenging. There are bus stops that have to be considered, but the bus does not stop all the time, so it may be better to walk. The right bus stops have to be found to use the right bus or metro and when changing trains, the algorithm should wait for an appropriate period of time so that the user does not miss the bus.

Implementation of a new algorithm for the fast intersection of unions of sorted lists

Zhongjie Cai, WS 2010/2011, Projekt Homepage

The goal of this master project was to implement a newly proposed algorithm for the fast intersection of unions of sorted lists using forward lists, presented in "Efficient Interactive Fuzzy Keyword Search"¹ at the WWW 2009 conference in Madrid.
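To make the underlying operation concrete, here is a straightforward baseline that first merges each group of sorted lists into a union and then intersects the unions. It is not the forward-list algorithm from the paper, only the operation that algorithm is meant to speed up; the function names are chosen for this example.

```python
import heapq


def union_sorted(lists):
    """Merge sorted lists into one sorted list without duplicates."""
    result = []
    for doc_id in heapq.merge(*lists):
        if not result or result[-1] != doc_id:
            result.append(doc_id)
    return result


def intersect_sorted(a, b):
    """Intersect two sorted lists with a linear two-pointer scan."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def intersect_unions(groups):
    """Intersect the unions of several groups of sorted lists."""
    unions = [union_sorted(group) for group in groups]
    result = unions[0]
    for union in unions[1:]:
        result = intersect_sorted(result, union)
    return result
```

In fuzzy keyword search, each group would hold the inverted lists of all words similar to one query keyword, so the result contains the documents matching every keyword approximately.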

RNA - Google

Tuti Andriani, Thilu Chang, WS 2010/2011, Projekt Homepage

The goal of this master project was to implement a prototype for fast search in large RNA repositories using three different algorithms with different search goals.

Implementation of basic RNA structure prediction algorithms

Li Zhang, SS 2010, Projekt Homepage

In this project I implemented two bioinformatics algorithms for RNA secondary structure prediction: the Nussinov algorithm and the Zuker algorithm. The basic idea of the Nussinov algorithm is to compute the maximum number of base pairs (A-U, C-G, U-G) within a given RNA string and then to trace back from this maximum to obtain the corresponding best RNA secondary structure. The basic idea of the Zuker algorithm is to compute the minimum Gibbs free energy within a given RNA string and then to trace back from this minimum to obtain the corresponding best RNA secondary structure.
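The dynamic program behind the Nussinov algorithm can be sketched as follows. The sketch only computes the maximum number of base pairs, omits the traceback, and does not enforce a minimum loop length; these simplifications and the function name are assumptions made for this example.

```python
def nussinov_max_pairs(seq):
    """Maximum number of nested base pairs (A-U, C-G, U-G) in an RNA string."""
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the substring seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])  # i or j stays unpaired
            if (seq[i], seq[j]) in pairs:           # i pairs with j
                inner = dp[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            for k in range(i + 1, j):               # split into two parts
                best = max(best, dp[i][k] + dp[k + 1][j])
            dp[i][j] = best
    return dp[0][n - 1]
```

The Zuker algorithm follows the same table-filling pattern but minimizes an energy value instead of maximizing a pair count.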

Daphne

Axel Lehmann und Jens Hoffmann, Bachelorprojekt, 2010, Projekt Homepage

Daphne is an online administration and information system for university courses. The system was implemented as a database application based on the Python web framework Django.

IceCite

Ico Chichkov, Claudius Korzen, Eric Lacher, Bachelorprojekt, 2010, Projekt Homepage

PDFLibraryGui (GWT)

The graphical user interface is implemented with the Google Web Toolkit. All other components are connected through it. PDFLibraryGui provides the entire web interface. It uses the other components to search the indexed papers, find hits, and match titles to a DBLP entry.

InvertedIndex (Java)

The InvertedIndex is a program written in Java for indexing dblp.xml. This component is used to match known titles to a DBLP entry or to find a DBLP entry for a citation.

Completesearch

Completesearch is a universal search engine developed under the direction of Prof. Dr. Bast. In this project it handles the search in the DBLP database as well as in the papers added afterwards.

Movie Organizer

Mirko Brodesser, Bachelorprojekt 2010, Projekt Homepage

This site provides information about a bachelor project called "MovieOrganizer". It was written at the University of Freiburg by Mirko Brodesser and supervised by Prof. Hannah Bast. It was developed in March and April 2010 as a part-time project. The goal was to write a program that gives users a good overview of their movies and lets them find a subset of movies according to their own criteria.

UniSearch

Johannes Schwenk, Diplomprojekt 2010, Projekt Homepage

This work presents the implementation of a new search function for the websites of the University of Freiburg, using the software CompleteSearch as the backend. A plugin system written in Python is developed for quickly and easily integrating new sources; it aggregates structured XML data. A generic plugin for Plone content is presented, which further simplifies adding more Plone portals to the search index. Together with additional plugins for sources not based on Plone, this enables search across all sources.
