Trinity College Dublin


1. Introduction

The information on this page applies to students taking final year projects, Year 5 dissertations, and M.Sc. dissertations in the School of Computer Science and Statistics under the following programmes:

  • BA (Mod) in Computer Science
  • BA (Mod) in Computer Science and Language
  • BA (Mod) in Computer Science and Business
  • BAI in Computer Engineering (D Stream)
  • BA Management Science and Information System Studies (MSISS)
  • BAI in Computer Engineering and Microelectronics (CD stream)
  • BA (Mod) Mathematics
  • Master in Computer Science (MCS)
  • Master in Computer Science (M.Sc.)
  • MAI in Computer Engineering

2. Guidelines for students

Important dates and deadlines for academic year 2018/19

Integrated Computer Science (Yr 4) Final Year Project
  Project Selection: Fri 5th Oct 2018
  Project Demonstration Period*: Mon 25th Mar - Fri 5th Apr 2019
  Project Presentation Material and Poster Submission: Tues 23rd Apr 2019
  Project Report Due: Tues 23rd Apr 2019
  Project Presentation and Poster Session: Fri 26th Apr 2019

Integrated Computer Science (Yr 4) Internship
  Internship Details Form Submission: Wed 10th Oct 2018
  Internship Goals Document Submission: Fri 1st Feb 2019
  Poster Submission: TBC
  Poster Presentation: Wed 10th Apr 2019
  Mid-Point Submission of Reflective Diary: Mon 1st Apr 2019
  Technical Report Submission: Tues 6th Aug 2019
  Final Submission of Reflective Diary: Tues 6th Aug 2019

Master in Computer Science (Integrated, Yr 5)
  Project Demonstration Period*: Tues 26th Mar - Thurs 28th Mar 2019
  Project Presentation Material and Poster Submission: Fri 5th Apr 2019
  Project Presentation and Poster Session: Tues 9th Apr 2019
  Dissertation Submission: Fri 12th Apr 2019

Master in Computer Science (M.Sc.)
  Research Supervisor Confirmed: TBC
  Research Proposal Written Up and Shared with Supervisor for Sign-off: TBC
  Signed-off Research Proposal Submitted: TBC
  Ethics Application Deadline (for any dissertation where a human study/trial is an integral part): TBC
  Project Demonstration Period*: Thurs 18th July - Fri 2nd Aug 2019
  Submission of Printed and Bound Copies of the Dissertation: Thurs 15th Aug 2019

Computer Engineering (Yr 4) Final Year Project
  CS4E2/CE4E2 Ethics Clearance Application Deadline: TBC
  Project Demonstration Period*: Mon 25th Mar - Fri 5th Apr 2019
  Project Report Due: Fri 12th Apr 2019

Computer Engineering (Yr 4) Internship
  Internship Details Form Submission: Wed 10th Oct 2018
  Internship Goals Document Submission: Fri 1st Feb 2019
  Poster Submission: TBC
  Poster Presentation: Wed 10th Apr 2019
  Mid-Point Submission of Reflective Diary: Mon 1st Apr 2019
  Technical Report Submission: Tues 6th Aug 2019
  Final Submission of Reflective Diary: Tues 6th Aug 2019

Master in Computer Engineering
  CS5E2 Research Methods - Preparation of a Research Proposal (25%): TBC
  CS5E2 Research Methods - Presentation of Research Proposal (5%): TBC
  CS5E1 Ethics Clearance Application Deadline: TBC
  CS5E2 Assignment on Experiment Design (30%): TBC
  CS5E1 Interim Report Due (5%): TBC
  CS5E2 Research Methods - A Short Discussion on Research Ethics Related to CS5E1 (10%): TBC
  CS5E1 Marked Project Demonstration/Presentation Period (20%)*: Mon 25th Mar - Fri 5th Apr 2019
  CS5E1 Dissertation Submission (75%): Fri 12th Apr 2019
  CS5E2 Research Methods - Research Paper Submission (30%): TBC

Management Science and Information System Studies (MSISS)
  Interim Presentations: Tues 20th Nov - Fri 23rd Nov 2018
  Project Report Due: Fri 29th Mar 2019

Computer Science & Business
  Project Demonstration Period*: Mon 25th Mar - Fri 5th Apr 2019
  Project Report Due: Fri 12th Apr 2019

Computer Science, Linguistics and Language
  Project Demonstration Period*: Mon 25th Mar - Fri 5th Apr 2019
  Project Report Due: Fri 12th Apr 2019

* Due to scheduling constraints it may be necessary to hold some demonstrations later in the week.

When to choose a project

An initial list of project proposals (from lecturing staff) will be released on the Thursday of the last week of Semester 2 in your Junior Sophister year. Supervisors will not accept supervision requests before this time. Further project proposals may be added to this list by lecturing staff over the summer vacation.

Students should select a final year project before the end of the third week of Semester 1. Where a student has not selected a project by the deadline, a project supervisor will be allocated to them, in consultation with the relevant course director, from among the supervisors who have not yet reached their supervision limits. The chosen supervisor will then assign the student a project or help them to specify a project in an area selected by the supervisor.


How to choose a project

Students may either

  • select a project from the list of project proposals put forward by the lecturing staff, or
  • alternatively propose their own project. If you have a project proposal of your own but are having trouble finding an appropriate supervisor, contact your course director.


In either case students must get the agreement of a supervisor before they will be considered as having selected a project. Supervisors may require a meeting with the student to discuss the project before accepting a supervision request. Once a supervisor agrees to supervise a project, details of the project assignment will be recorded centrally by the supervisor.

Students may only select a single project, but they may change their minds and select an alternative project before the end of the third week of Semester 1. However, if a student selects a new project, they must notify both the old and new supervisors that their previously chosen project is to be cancelled.


Choosing a project supervisor

Students should note that each supervisor will only take a limited number of students. If you find that this information is incorrect, please send details to Final.Year.Project.Coordinator@scss.tcd.ie

Students should also note that there are only a limited number of supervisors in any area. Hence students are not guaranteed a project in their area of choice.


Project demonstrations and reports

See the following documents:



3. Supervisors' project areas

The following table indicates the broad areas within which projects are generally supervised, together with the potential supervisors in these areas. Each name is linked to a list of projects proposed by that lecturer.

Subject Area: Supervisors willing to supervise projects in this area

Artificial Intelligence: Michael Brady, Vincent Wade, Martin Emms, Tim Fernando, Rozenn Dahyot, Carl Vogel, Khurshid Ahmad, Alfredo Maldonado Guerra, Ivana Dusparic, Joeran Beel, Majid Latifi
Computational Linguistics: Martin Emms, Tim Fernando, Carl Vogel, Khurshid Ahmad, Alfredo Maldonado Guerra
Computer Architecture: Jeremy Jones, David Gregg, Michael Manzke, John Waldron, Jonathan Dukes
Computer Vision: Kenneth Dawson-Howe, Gerard Lacey, Iman Zolanvari
Distributed Systems: Stefan Weber, Mads Haahr, Dave Lewis, Jonathan Dukes, Melanie Bouroche, Siobhan Clarke, Ivana Dusparic
Foundations and Methods: Hugh Gibbons, Andrew Butterfield, Glenn Strong, Tim Fernando, Vasileios Koutavas
Graphics, Vision and Visualisation: Kenneth Dawson-Howe, Fergal Shevlin, Gerard Lacey, Michael Manzke, John Dingliana, Carol O'Sullivan, Rozenn Dahyot, Khurshid Ahmad, Rachel McDonnell, Aljosa Smolic
Health Informatics: Lucy Hederman, Gaye Stephens, Mary Sharp, Joeran Beel
Information Systems: Mary Sharp, Joeran Beel
Instructional Technology: Brendan Tangney, Gerard Lacey, Mary Sharp, Glenn Strong, Richard Millwood
Knowledge and Data Engineering: Vincent Wade, Lucy Hederman, Mary Sharp, Declan O'Sullivan, Dave Lewis, Owen Conlan, Khurshid Ahmad, Rob Brennan, Seamus Lawless, Kris McGlinn, Kevin Koidl, Joeran Beel
Networks and Telecommunications: Hitesh Tewari, Stefan Weber, Eamonn O'Nuallain, Meriel Huggard, Ciaran McGoldrick, Jonathan Dukes, Stephen Farrell, Melanie Bouroche, Marco Ruffini, Douglas Leith, Lory Kehoe, Georgios Iosifidis
Other: David Abrahamson, Michael Brady, Stephen Barrett, Melanie Bouroche, Marco Ruffini, Vasileios Koutavas, Douglas Leith, Joeran Beel
Statistics: Mary Sharp, Rozenn Dahyot, John Haslett, Simon Wilson, Brett Houlding, Jason Wyse, Arthur White, Douglas Leith, Bernardo Nipoti, Mimi Zhang


4. Project proposals for the academic year 2018/19

The following is a list of suggested projects for final year BA (CS), BA (CSLL), BA (CS&B /B&C), BAI, MAI, MCS, M.Sc., and MSISS students for the current year. Note that this list is subject to continuous update. If you are interested in a particular project you should contact the member of staff under whose name it appears.

This is not an exhaustive list and many of the projects proposed can be adapted to suit individual students.


Dr. Arthur White

Updated 29/09/17. email: arwhite@tcd.ie or phone +1062. I am based in room 144, Lloyd Institute.

I am interested in problems in computational statistics, where we use algorithms to infer the parameters of a model. The following project areas are suitable for MSc students taking the Data Science strand. I'm afraid that I'm unable to supervise final year undergraduate students this year. In all cases a good working knowledge of statistical methods, e.g., maximum likelihood estimation, Bayesian inference, and Monte Carlo methods, will be helpful, and a general interest in statistics will be essential. Each project will be expected to involve:

  • A review of methodology in the area.
  • Implementing an inference routine for the model, probably using R.
  • Applying the model to data in a detailed analysis.


Scalable clustering methods

Model-based approaches are a popular way to perform clustering in a principled and coherent framework. The standard approach to clustering involves running an iterative algorithm that computes summary statistics using the entire dataset at every iteration. In this project we would investigate alternative approaches that locally re-assign observations to different clusters. There is scope to parallelise elements of this algorithm, or to cluster only a subset of the data at a single iteration. This would make it possible to scale up the clustering method to much larger datasets.
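To make the local re-assignment idea concrete, here is a rough Python sketch (the project itself would probably be implemented in R and would use a model-based rather than a k-means objective); it re-assigns only a random mini-batch of observations at each iteration and nudges the affected cluster centres, so no pass over the full dataset is needed. All names and parameters are illustrative.

    # Illustrative mini-batch / local re-assignment sketch; a k-means-style
    # objective stands in for the model-based one discussed above.
    import numpy as np

    def minibatch_cluster(X, k, iters=200, batch=256, seed=0):
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(len(X), k, replace=False)].copy()
        counts = np.ones(k)                    # how often each centre was updated
        for _ in range(iters):
            idx = rng.choice(len(X), size=min(batch, len(X)), replace=False)
            B = X[idx]                         # only this subset is touched
            d = ((B[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
            z = d.argmin(1)                    # local re-assignment of the batch
            for j in range(k):
                members = B[z == j]
                if len(members):
                    counts[j] += len(members)
                    step = len(members) / counts[j]   # decaying step size
                    centres[j] += step * (members.mean(0) - centres[j])
        return centres

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(m, 1.0, (500, 2)) for m in ([0, 0], [5, 5], [0, 5])])
    print(minibatch_cluster(X, k=3))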

Social network analysis PROJECT NOW TAKEN

Social network analysis involves studying the relationships between a set of objects. In many situations, there are patterns to the types of relationships that are formed - for example, communities of people who are more likely to link to each other than to other people in the network, and leader/follower dynamics. The stochastic blockmodel is a popular statistical method for detecting these patterns. The project would involve investigating several novel areas of interest, including overlapping community detection, degree-corrected blockmodels, or non-binary edges, for example, looking at email exchanges between users. Reference: Arthur White and Thomas B. Murphy, "Mixed-Membership of Experts Stochastic Blockmodel", Network Science, Volume 4, Issue 1, March 2016, pp. 48-80.

Clustering with distal outcomes

A recent area of research involves using the output of a clustering method as a predictor variable for a regression. For example, we cluster students by study habits, then use the clusters to predict their module grade. Estimation for such methods is fundamentally divided into two steps: 1) performing the clustering, and 2) performing the regression. For statistical inference to be valid, the second step of the estimation process has to take into account the estimation uncertainty of the first step. The project would involve investigating new approaches to valid inference for this problem. Reference: Stephanie T. Lanza, Xianming Tan, and Bethany C. Bray: "Latent Class Analysis With Distal Outcomes: A Flexible Model-Based Approach" Struct Equ Modeling. 2013 Jan; 20(1): 1-26.
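The naive two-step pipeline that the project seeks to improve on can be sketched in a few lines; the data, cluster count and scikit-learn estimators below are all stand-ins, and the point is that step 2 treats the estimated labels as if they were known.

    # Naive two-step sketch (synthetic data; estimators are stand-ins).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    habits = np.vstack([rng.normal(m, 1.0, (100, 2)) for m in (0.0, 4.0)])
    grade = np.where(np.arange(200) < 100, 55.0, 68.0) + rng.normal(0, 5, 200)

    # Step 1: cluster the predictors.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(habits)

    # Step 2: regress the distal outcome on the *estimated* labels. Treating
    # them as known ignores the step-1 uncertainty, which is exactly the
    # inferential gap the project targets.
    reg = LinearRegression().fit(labels.reshape(-1, 1), grade)
    print(reg.intercept_, reg.coef_)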

Probabilistic record linkage

Data linkage is the activity of matching data from multiple sources that correspond to the same individual. As more and more sources of data become available, this activity has become increasingly popular. The goal of this project will be to investigate statistical approaches to record linkage, so that even when imperfect matches occur between data sources, the uncertainty surrounding a match can be quantified. The project will apply these methods to data in the AVERT programme. Reference: Rebecca C. Steorts, Rob Hall, and Stephen E. Fienberg, "A Bayesian Approach to Graphical Record Linkage and De-duplication", Journal of the American Statistical Association, Volume 111, Issue 516, 2016.
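For orientation, a toy match scorer in the classical Fellegi-Sunter style is sketched below; this is a simple baseline rather than the Bayesian graphical model of the cited paper, and the agreement probabilities are invented.

    # Fellegi-Sunter-style match scoring: sum log-likelihood-ratio weights
    # over agreeing/disagreeing fields (illustrative weights and records).
    import math

    def match_weight(rec_a, rec_b, fields):
        w = 0.0
        for f, (m, u) in fields.items():   # m = P(agree | match), u = P(agree | non-match)
            if rec_a.get(f) == rec_b.get(f):
                w += math.log(m / u)
            else:
                w += math.log((1 - m) / (1 - u))
        return w

    fields = {"surname": (0.95, 0.01), "dob": (0.99, 0.003), "postcode": (0.90, 0.05)}
    a = {"surname": "murphy", "dob": "1990-03-12", "postcode": "D02"}
    b = {"surname": "murphy", "dob": "1990-03-12", "postcode": "D04"}
    print(match_weight(a, b, fields))      # high score despite one disagreement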

Dr. Joeran Beel

Position: Ussher Assistant Professor in Intelligent Systems
Contact: Please visit our WIKI for details on how to apply for a FYP or dissertation.
Last update: 2018-08-01

Joeran Beel and his team are part of the ADAPT Research Centre as well as of the Knowledge and Data Engineering Group (KDEG) of the Intelligent Systems Discipline at the School of Computer Science and Statistics at Trinity College Dublin. Our work focuses on machine learning, text mining, natural language processing, the blockchain and other technologies, in areas including recommender systems, search engines, news analysis, plagiarism detection, and machine translation. Domains we are particularly interested in include digital libraries, digital humanities, open science, eHealth, tourism, law, fintech, and mobility. For more details see our research areas, projects, publications, and industry partners.

We have around 50 project ideas for FYP and dissertations. They are listed and maintained in our WIKI. The following list provides only a small excerpt:

  • "Pimp That Voice" (Eliminate Annoying Accents from Audio/Video)
  • "Outcry Or Not" (predict if a tweet will cause a public outcry)
  • Stable Neural Turing Machines
  • Sketcha: Captcha based on Drawings
  • Stereotype.me (Demonstrate the potential bias of Machine Learning Algorithms)
  • "Stability" as RecSys/ML Evaluation Metric
  • Considering "Time" in Recommender-Systems Evaluation
  • The effect of Dataset Pruning on Recommender-Systems Evaluation
  • Negative User Modelling: Utilizing documents that are usually considered as not relevant for user modeling
  • Virtual Citation Proximity: Use Citation-Ground Truth to Train Text-Based Machine Learning
  • "Nobel Prize or Not?" (Academic Career/Performance Prediction)
  • Machine Translations for Multi-Lingual Content-Based Filtering
  • Recommendation Persistence: How often should the same recommendations be shown to a user?
  • Entity Embeddings (Hybrid Flexible or Multi Embeddings)
  • ML-Augmented Datasets for Improving Recommender Systems Performance
  • Disjunctive Union Evaluation as Alternative to Interleaving and Classic A/B Test
  • "Heart Rate Variability" as Implicit Ratings in Recommender Systems
  • Time-normalized TF-IDF (TF-IDF)t: A novel term weighting scheme to enhance recommendation effectiveness
  • "Sequences-of-Bags" Learning
  • TF-IDrF: A novel Term-Weighting Scheme based on Inverse Recommended-Document Frequency
  • "ASEO Me" (Optimize Research Articles for Academic Search Engines)
  • The Cryptocurrency Donation Calculator
  • ... and several more projects

If you have your own idea relating to one of our research fields (particularly recommender systems, machine learning, machine translation, text mining, natural language processing, blockchain, ...), we would also be happy to hear about it. Continue reading in our WIKI.

Dr Mélanie Bouroche

I am happy to supervise projects in the Smart Cities area; if you have any idea that might make our cities smarter, get in touch! I am particularly interested in connected autonomous cars and their effect on cities, addressing questions such as: how can such smart cars share the road with human-driven cars? What proportion of them is needed to make traffic safer and more efficient for everybody? While those are big research questions, a number of projects can be carved out depending on students' specific skills and interests.

Using Autonomous Driving to improve traffic safety and efficiency

The deployment of autonomous vehicles on our roads provides us with a unique opportunity to improve the safety and efficiency of the overall traffic flow (composed of both human-driven and autonomous vehicles). Indeed, autonomous vehicles can dampen the oscillations created by human-driven vehicles, thereby leading to more stable traffic flows. This project will investigate the design of appropriate controllers for autonomous vehicles to achieve this, and test them on a simulation platform.
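As a rough, uncalibrated illustration of the damping effect such a controller aims for, the toy ring-road sketch below lets human-driven cars follow an optimal-velocity rule while one "autonomous" car tracks the mean flow speed; all parameters are invented, and the AV's access to the mean speed is an idealisation.

    # Toy single-lane ring road: 19 human-driven cars plus one AV (car 0).
    import numpy as np

    N, L, dt = 20, 200.0, 0.1
    rng = np.random.default_rng(2)
    x = np.sort(rng.uniform(0, L, N))        # positions on the ring
    v = np.full(N, 8.0)
    v[5] = 4.0                               # inject a small disturbance

    def v_opt(gap):                          # desired speed given headway
        return 10.0 * np.tanh(gap / 10.0)

    for _ in range(3000):
        gap = (np.roll(x, -1) - x) % L       # distance to the car ahead
        a = 2.0 * (v_opt(gap) - v)           # humans react to their own headway
        a[0] = 1.0 * (v.mean() - v[0])       # AV: drive at the mean flow speed
        v = np.clip(v + a * dt, 0.0, None)
        x = (x + v * dt) % L

    # Replace the a[0] line with the human rule to compare: the spread of
    # speeds should settle down faster with the AV controller in place.
    print("speed standard deviation:", round(float(v.std()), 3))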

Coordinating Connected Autonomous vehicles to improve traffic safety and efficiency

This project will investigate how vehicles can communicate with each other to coordinate their behaviour in traffic composed of both autonomous and human-driven vehicles.

Privacy-aware travel assistant

Current journey planners only provide information before the start of a journey. This project will investigate how travellers can be supported during their travels, proactively notifying them if they should update their travel plans because of delays or disruptions. A key aspect is to ensure that the privacy of the traveller is maintained.

Dr Mike Brady

Traffic Analysis from Advanced Bus Transportation Systems (ABTS)

    This project would be to take publicly available datasets generated from ABTS and possibly elsewhere to estimate where congestion and delays might be occurring.

Open Street Map Contribution.

    This project would be to condition, process and upload good-quality road grade information to publicly-available maps.

ABTS Data Visualisation.

    Dashboard-type Visualisation for Advanced Bus Transportation Systems (ABTS)-related data.

Dr. Andrew Butterfield

Room: ORI G.39
Extension: 2517

Project ideas for 2018/19.

First draft giving basic outline of projects. Links to background material to follow (Revised 6th Sep 2018).

Projects suitable for 4th-year projects are marked with [B], while those suitable for 5th-year dissertations are marked with [M]. Some project ideas can be easily scoped for both and are marked [B/M].

The first collection of projects involves the use of the pure, lazy, functional programming language Haskell:

  • Using Haskell to develop domain-specific languages (DSLs) similar to the ideas described in the financial combinators paper [B/M]. DSLs could describe other kinds of contracts, including smart contracts for blockchains (with Hitesh Tewari) [B/M]. Also of interest would be DSLs for non-financial/legal domains, such as software requirements capture, or describing flows in software-defined networking (SDN) [B/M].
  • A theorem prover, written in Haskell, is currently under development, with a command-line/REPL style user interface. On Unix/OS X systems, unicode (UTF-8 encoded) is used to display mathematical symbols, and ANSI escape sequences are used to highlight, colour, and transform text. Projects are available to explore user-facing enhancements, such as:
    • Extending the display of unicode mathematical symbols, and the ANSI escape sequences, to Windows. Windows unicode is based mainly on UTF-16 and UTF-32 encodings, and the Windows "Cmd" terminal does not natively support ANSI escape sequences [B].
    • Adding a graphical user-interface (GUI) to the prover. This could be based on a previous final-year project that explored an improved GUI for an earlier implementation of the prover, which focussed on the open-source threepenny-gui package and the Electron web-browser [B/M].
    • Adding high-quality pretty-printing and formula parsing to the prover [B/M].
    • Using the prover to build a useful library of theories. This would also provide feedback regarding the user-facing aspects above [B/M].
    • Extending the prover by adding proof automation functions that try various strategies to find proofs, or make progress toward one [B/M].
  • Theory hacking in Haskell (intrigued? get in touch) [B/M].
  • The model-checking tool FDR4 has an open-source front-end parser, written in Haskell, for a language known as "machine-readable CSP" (CSPm). This project is to upgrade the parser to handle CSPm source that is embedded within LaTeX files, to produce so-called "literate CSPm" [B].

The second collection of projects does not require Haskell programming (although it may be used, if preferred).

  • Using the Process Model Language (PML) and tools to model and analyse business processes and workflow, ranging over application areas as diverse as financial operations, software development processes, clinical healthcare pathways and medical diagnosis [B].
  • Extend the PML tools to provide new analyses based on new semantic models developed in our research. The original PML tool was written in C, but parsers for Haskell and Java, to name but a few, are now available [B/M].
  • Continuing work exploring the use of the CSP language (Communicating Sequential Processes) and the FDR4 tool to capture models of requirements for an operating system "separation kernel" to be used for spacecraft [B/M].
  • As per the above project, but using the Circus language and a recently developed "Circus to CSPm" translator to provide FDR4 input. [B/M].

Dr Ciarán Mc Goldrick

URL: www.scss.tcd.ie/Ciaran.McGoldrick :: www.ciaranmcgoldrick.net

Room: Lloyd 1.10a (inside 1.11)
Extension: 3626

I am happy to supervise projects at both senior undergraduate and MSc level. In recent years I have predominantly been mandated to supervise MSc projects.

In general you should have a strong academic record, an interest in networking, communications and control/signal processing, be motivated to succeed and solve problems, be a solid programmer and have some affinity for hardware.

I will be offering a variety of projects in 2017/18, some of which may include the opportunity to collaborate with colleagues in UCLA.

Vehicular Communications

I will be offering two projects on vehicular communications.
One will be a continuation of a project on the use of Visible Light Communications and associated systems as a side channel for secure V2V and V2I communications. This project will involve the evolution and development of existing hardware circuits and control systems.
A second project will focus on efficient, low-loss medium switching in response to rapidly changing and dynamic vehicular mobility.

Underwater Networking

I will be offering two projects on Underwater Networking.
Both will involve (contribution to) the development of a community-accessible underwater networking test and evaluation platform. There will be two separate development, integration and practical evaluation strands that will complement activities in our H2020 project.

Control

I will be offering a project involving the development and evolution of a new form of primitives for use in distributed control modalities.

Your project ideas ...

If you have an interesting or compelling idea in the networking, communications, security, control or STEM education domains please feel free to get in touch. In doing so please be able to clearly and concisely tell me: i) what you propose to do; ii) why you want to do it; iii) what the interesting (research) hypothesis is; iv) how or why anyone would be interested in your completed project.

Further info: Ciaran Mc Goldrick

Last updated: 12/7/2017

Prof. Rozenn Dahyot

    My areas of interest are in robust statistics, statistical inference, nonparametric statistics, pattern detection & recognition, forecasting, tracking, computer vision, signal processing, computer graphics. If you have any interest in applied statistics and mathematics in computer vision problems, feel free to contact me (at Rozenn.Dahyot@scss.tcd.ie).
    More specific projects on offer (FYP or MSc):
  • Application of deep learning to problems such as 3D reconstruction from images, image super-resolution, image recoloring, image registration, source separation, audio classification.
  • Learning from random functions (e.g. functional data analysis, FPCA)
  • Investigation of Optimal transport, Copula and Information theory (with application in image color transfer and shape registration)
  • GIS visualisation for spatiotemporal model rendering: investigating software tools such as QGIS and the Unreal game engine as mapping tools for GPS-tagged data.

Prof Dave Lewis

Email me with "PROJECT IDEA" in the subject line.

Privacy Canvas:

The Business Model Canvas is a popular tool for the iterative modeling of business ideas. Recently we have adapted the affordances of the business model canvas (simplicity, graphical layout, ease of design iteration) to the problem of modelling the ethical issues of a digital application project. This has resulted in the Ethics Canvas design and application, which has been used to help teach ethics considerations at undergrad, postgrad and postdoc levels. A similar tool may be useful when considering and teaching privacy and data protection concerns. This project will refactor or redesign the ethics canvas code to offer a canvas-style interface for brainstorming the data protection issues in a digital application design, in a way suitable for supporting training in this topic in remote groups.

Multilingual GDPR annotator:

With multiple approaches emerging to support compliance to the EU’s new General Data Protection Regulation, supporting the linking of different privacy policy or privacy impact assessment documents back to the GDPR source text becomes of interest to those needing to demonstrate compliance. This project will provide web annotation tool support for linking GDPR text with organisation specific data protection documents, and enable this for different languages. This could then be used for other regulations or standards requiring compliance tracking internationally. The approach should follow a standardized web annotation approach and should build on the linked data representation of the GDPR text developed in the school. This project would suit a student with strong language skills in a European language in addition to English.

Generic Data Management Artefact Meta-Management:

Data management is becoming an increasingly complex and vital part of any organisation attempting to leverage big data assets. Declarative data objects using standard vocabularies and data manipulation languages provide powerful data management features, but as they become popular these objects themselves must be managed over their useful lifecycle, so they can be indexed, discovered, revised, corrected etc. This project will explore open vocabularies and tools to provide support for such lifecycle management over a small sample set of artefacts, namely: semantic mapping in SPARQL and its explicit representation in SPIN, data uplift mapping in R2RML, and data protection compliance queries in SPARQL/SPIN.

Open Data for Research Ethics:

Research ethics clearance needs to be secured for scientific studies within research institutes, but the details and provenance of such clearance are typically not available if experimental data is later shared with other researchers. This project will explore a linked open data vocabulary to complement existing open science data models (e.g. that of OpenAire) to allow the ethics clearance associated with that data to be recorded and shared in an interoperable manner between research institutes via an open API.

Asserting Collective Control over the Means of Cognition:

Big web-based companies, often referred to as digital ‘platforms’, are able to leverage personal data on a massive scale for use in targeted advertising and other opaque behavioural influencing activities. Modern machine learning techniques lead to a massive information asymmetry between users and such companies, i.e. asymmetry between what they know about us and what we know about how they leverage, share and use our data. While data protection regulation aims to redress this balance, it only operates at the level of the rights of individuals, so this power asymmetry may not be greatly impacted for the population of users overall. This project will explore ways in which social media groups can be used to share concerns about the aggregation, sharing and processing of personal data and to organise collective action around these concerns, up to and including mass transfer of personal data to another platform. Tools to enable mass, collectively organised transfer of data to another platform can exploit both the enhanced right to portability users now enjoy and interoperability standards from the World Wide Web Consortium’s working group on the Social Web.

Digital Ethics Observatory:

News stories about Big Data and AI ethics appear in the media daily. However, there are few resources available for those wishing to monitor these fast-moving issues. This project will develop an application that allows news stories to be archived and then annotated by interested volunteers using the ethics canvas tool (ethicscanvas.org), to provide an open, searchable index of digital ethics news stories for researchers, journalists and concerned citizens alike.

Data Protection Process Browser Widget:

The EU’s new General Data Protection Regulation offers users across EU common rights on how their data is processed by organisations. This project will develop and evaluate a web widget that can be integrated into different web sites and offer a simple graphical, process-oriented visualization for exploring the rights offered by a specific service’s privacy policy, based on an existing model developed in the ADAPT Centre for Digital Content Research.

Blockchain for Value-chain Consent Management:

The EU’s new General Data Protection Regulation offers users the right to rectify or erase data previously provided to a service provider. Responses to requests that exercise this right must be propagated to any other organisations with whom that user’s data has been shared, and their implementation must be recorded for regulatory compliance purposes. This potentially adds significant complexity to systems for sharing data along business value chains. This project will explore the level to which existing blockchain platforms can reduce this complexity and the cost involved, especially in order to mitigate the risk of this regulation becoming an excessive burden on data sharing by small to medium enterprises.

Visualising provenance for data and consent lifecycles:

The upcoming General Data Protection Regulation requires companies and organisations to maintain a record of the user’s consent and data lifecycles. These lifecycles can be complex, as the same consent and data can be used in several activities, which makes it difficult to track their usage. Visualisations are a great way to display information in a concise and simple manner, and can prove to be helpful in navigating complex pathways such as these lifecycles. The project explores various ways to visualise provenance traces in a granular manner so as to enable tracing data and consent from a user to all the activities that use them, based on an existing model developed in the ADAPT Centre for Digital Content Research.

Integration of Building Data Sets in a Geospatial Context:

Currently, building information is often dispersed and fragmented across different storage devices, different file formats and different schemas. This data must be integrated in order to support a range of use cases relevant to smart buildings and cities, such as those related to navigation, building control and energy efficiency. In this project you will explore available standards and data sets and, using established methodologies for data uplift, convert these datasets into Linked Data, making them available over the web and linking them to other available data sets, such as geospatial data. You will answer the question: can existing open datasets be used to derive useful information about buildings to support the aforementioned use cases?

Exploratory technologies for supporting data uplift - https://opengogs.adaptcentre.ie/debruync/r2rml
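To give a minimal flavour of such an uplift step, here is a sketch using the rdflib Python library: one building record becomes RDF triples with a geospatial link. The vocabulary URIs and identifiers below are illustrative placeholders, not the ontologies the project would actually adopt.

    # Minimal uplift sketch with rdflib; vocabulary URIs are placeholders.
    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.org/building/")
    GEO = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")

    g = Graph()
    b = EX["b-1001"]                       # one (invented) building record
    g.add((b, RDF.type, EX.Building))
    g.add((b, EX.currentUse, Literal("office")))
    g.add((b, GEO.lat, Literal(53.3438)))  # link to geospatial data
    g.add((b, GEO.long, Literal(-6.2546)))

    print(g.serialize(format="turtle"))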

Conversion of building information geometry into geospatial geometric data:

The Industry Foundation Classes (IFC) is a standard for exchanging building information. Currently, a large part of the standard is dedicated to storing and exchanging data about the geometry of the building and building elements. A complex set of relations is maintained within the IFC schema to support geometry, which when converted to RDF leads to significant overhead in terms of storage as triples. In this project you will explore methods for reducing the size of the geometry of IFC models, in particular through their conversion to Geographical Information Systems standards such as Well Known Text, answering the question: are GIS geometry models a suitable way to store building geometries?

Exploratory technologies for working with IFC geometry (removes geometry from an IFC OWL conversion - https://github.com/pipauwel/IFCtoSimpleBIM)

Visualisation of building geometry in a geospatial context:

Open and accessible building information can support multiple use cases relevant to smart buildings and cities. The OSi has a large dataset of building data, which includes geospatial data about building location, and other properties like its current use and the type of building (its form and function). In this project you will explore an interface for the visualization of the OSi building data to support the querying of buildings, but also interaction with the building geometry through a web interface, e.g. point-and-click selection of buildings (with HTML5 and the three.js WebGL library). You will examine what is an appropriate way to visualise building data so that it can support users when generating and exploring queries. Exploratory technology for visualising building information is available, showing a very simple three.js GIS model which integrates OSi county data.

Online questionnaire tool for GDPR compliance assessment:

The General Data Protection Regulation (GDPR), agreed upon by the European Parliament and Council in April 2016, will replace the EU Data Protection Directive (EU DPD). Organizations dealing with personal data of EU citizens must ensure that they’re compliant with the new requirements of the GDPR before it becomes effective in 2018. It is important for organizations dealing with personal data to assess their compliance with the GDPR to identify risks before regulatory violations occur, as the fines under the GDPR can be up to 4% of a company's global turnover. This project will build an online support tool for GDPR compliance based on assessment questionnaires. The tool will show the important aspects of the GDPR and, based on answers to the compliance assessment questions, will show whether an organization is fully compliant or where it needs to work to improve compliance.

Prof. David Gregg

Room: 130 Lloyd Institute
Extension: 3693

To take any of my projects you will need good programming skills.

Note that I am open to supervising both masters and final year projects. Many of the projects below could be taken as either a BA final year project or a masters in computer science project. Clearly, the masters version of a project would involve more work than the bachelors version.

Sparse data structures and algorithms for convolution layers in deep neural networks

Suitable as BA final year project or MCS masters project.

Deep neural networks (DNNs) are among the most successful machine learning technologies. They are widely used in tasks such as recognizing and classifying objects in images. DNNs are most useful when they can be deployed in the field at the source of image, sound and other data, rather than in data centres. However, DNNs also require very large amounts of data and computation.

One way to reduce the resource requirements of DNNs is to prune the trained weights of the DNN. DNNs are trained by slowly modifying weights stored in large matrices (multidimensional arrays). It has been observed that many of the weights can be replaced with zero values without affecting the accuracy of the DNN. Matrices with a large number of zero entries are known as sparse matrices. Using appropriate data structures and algorithms, sparse matrices with many zeros can use much less computation and memory than normal arrays.

The goal of this project is to investigate different data structures and algorithms for the convolution layers in deep neural networks. This will involve learning about and implementing existing known sparse matrix representations such as compressed sparse row (CSR) and block sparse row (BSR), as well as developing and implementing new data structures and different approaches to performing sparse convolution for DNNs. The most obvious programming language to use for this project is C/C++ with x86 or ARM vector SIMD intrinsics, although other languages such as CUDA are possible.
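To make the CSR layout concrete, the sketch below builds a hand-rolled CSR representation of a pruned weight matrix and multiplies it by a vector; it is written in Python purely as a reference point, whereas the project itself would target C/C++ with SIMD intrinsics as described above.

    # Hand-rolled CSR sketch for a pruned weight matrix (illustrative only).
    import numpy as np

    def to_csr(W):
        # Keep only non-zero weights: their values, their column indices,
        # and a row-pointer array marking where each row starts.
        values, col_idx, row_ptr = [], [], [0]
        for row in W:
            nz = np.nonzero(row)[0]
            values.extend(row[nz])
            col_idx.extend(nz)
            row_ptr.append(len(values))
        return np.array(values), np.array(col_idx), np.array(row_ptr)

    def csr_matvec(values, col_idx, row_ptr, x):
        y = np.zeros(len(row_ptr) - 1)
        for i in range(len(y)):
            s, e = row_ptr[i], row_ptr[i + 1]   # the non-zeros of row i
            y[i] = values[s:e] @ x[col_idx[s:e]]
        return y

    W = np.array([[0., 2., 0.], [0., 0., 0.], [1., 0., 3.]])   # pruned weights
    x = np.array([1., 1., 1.])
    vals, cols, ptr = to_csr(W)
    print(csr_matvec(vals, cols, ptr, x), W @ x)               # should agree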

Handwriting recognition with deep neural networks [TAKEN]

Suitable as MCS/MAI masters project for a motivated strong student. Not suitable as BA(Mod)/BAI project.

In recent years various kinds of artificial neural networks have been applied to problems that are traditionally difficult for computers, such as image recognition. Some of these kinds of networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have achieved levels of accuracy that are far higher than previous machine-learning approaches. These kinds of "deep" neural networks have also been applied to the difficult problem of off-line handwriting recognition, that is turning an image of handwritten text into the corresponding sequence of characters. The results have been quite successful for languages as varied as English and Arabic.

The goal of this project is to adopt an existing approach to English language handwriting recognition and adapt it to recognizing handwritten text in Irish using the traditional Irish script. The National Folklore Collection (https://www.ucd.ie/irishfolklore/en/) contains large numbers of documents handwritten in Irish script in the 1930s. Many of these documents have been scanned, and a large subset have been transcribed into readable text. These images and transcriptions are available on the website www.duchas.ie.

For the project you should take an existing design and/or implementation of a pipeline for processing pages of handwriting, and use deep learning handwriting recognition to turn the image of the handwriting into ASCII text. An important stage of the process will be gathering a training set of sample pages and text transcriptions from the duchas.ie website. Unfortunately the website does not make the data available in a format that is easy to process automatically, so gathering the training dataset will involve some work. (I've asked, and they do not make the data available in a more machine-friendly format). Using the training dataset, you will then train the neural network to recognize handwritten Irish text and output ASCII text.

This project will require background reading and creativity. You will need to allow time to gather datasets and train the neural network. It is therefore suitable only as an MCS/MAI project. There is not enough time for it to be taken as a BA(Mod) or BAI project. Some basic knowledge of the Irish language will be helpful for doing this project.

Sparse data structures and algorithms for fully-connected layers in deep neural networks

Suitable as BA final year project or MCS masters project.

Deep neural networks (DNNs) are among the most successful machine learning technologies. They are widely used in tasks such as recognizing and classifying objects in images. DNNs are most useful when they can be deployed in the field at the source of image, sound and other data, rather than in data centres. However, DNNs also require very large amounts of data and computation.

One way to reduce the resource requirements of DNNs is to prune the trained weights of the DNN. DNNs are trained by slowly modifying weights stored in large matrices (multidimensional arrays). It has been observed that many of the weights can be replaced with zero values without affecting the accuracy of the DNN. Matrices with a large number of zero entries are known as sparse matrices. Using appropriate data structures and algorithms, sparse matrices with many zeros can use much less computation and memory than normal arrays.

The goal of this project is to investigate different data structures and algorithms for the fully-connected layers in deep neural networks. This will involve learning about and implementing existing known sparse matrix representations such as compressed sparse row (CSR) and block sparse row (BSR), as well as developing and implementing new data structures and different approaches to performing sparse matrix computation for DNNs. The most obvious programming language to use for this project is C/C++ with x86 or ARM vector SIMD intrinsics, although other languages such as CUDA are possible.

Dr. John Dingliana

https://www.cs.tcd.ie/John.Dingliana/

Room: 02-014 Stack B
Extension: +3680


Some general info

I am a member of the graphics, vision and visualisation research group interested in the areas of:

  • 3D Visualisation
  • Computer Graphics and Virtual Reality
  • Graphical aspects of Mixed and Augmented Reality
  • Stylised Rendering / Non photo-realistic rendering
  • Physically-based Animation

Suggested Projects:

  1. Point-based Rendering on Virtual Reality/Augmented Reality displays: Video and 3D information will be visualized on a Head Mounted Display in the form of a Point Cloud (points in 3D space with colours). Data will initially be taken from static (pre-captured) datasets but the objective would be eventually to render live captured data.
    Potential challenges include the following (each of which could be the focus of a project):
    • ensuring accuracy/fidelity of the visualization
    • reducing latency, improving performance
    • adaptive level of detail
    • progressive rendering
  2. Augmented human vision. [This is an implementation project. Best suited for a 4th Year FYP but could be extended for an MSc Dissertation]: The objective of this project is to address some of the challenges of merging virtual graphical objects with dynamic real-world objects to provide information about the object to the user in augmented reality (AR). Microsoft ran a competition for serious application proposals for the Hololens AR display and some of the winners are listed [HERE]. The question is how information can be effectively and aesthetically displayed in such applications. PLEASE NOTE: This is mainly a graphics project; we are not interested in accurate sensor data (this will be simple or largely simulated), but instead in interesting and seamless ways of blending 3D information onto the real world. Rendering must be done interactively and in real-time.
  3. Spatial perception in AR : This project will explore how users perceive relative distances of objects (e.g. real vs virtual) in mixed environments. Can users reliably judge which object or feature is closer, do users have an accurate sense of scale, can users be convinced that a real and virtual object are collocated/connected? In particular there is limited work in up-and-coming "see-through AR" devices such as the Microsoft Hololens.
    The effort in the project will be in using one or more AR displays to render experimental 3D graphical scenes wherein virtual objects are embedded in the real world; implementing a number of strategies (mostly from existing literature) to improve spatial perception in such scenes; implementing a testing scenario to compare spatial perception across the different strategies; and potentially running a pilot experiment.
  4. Spatial perception in games [This is an implementation project, best suited for a 4th Year FYP but could be extended for an MSc Dissertation]: The objective of the project is to implement a simple game or several mini-games to test how different rendering styles and display techniques affect user performance at spatial perception tasks. Some simple examples are 3D versions of the classic games Pong or Breakout. Many variants of these have been implemented but a major challenge for the user is accurately gauging how far away an object is supposed to be in the z-direction (coming out of / going into the screen). Proposed solutions to enhance a sense of depth include shadows, focal blur with distance (depth-of-field), stereoscopy, size, motion, parallax etc.
    Pre-requisites: students must have taken (or be in the process of taking) a computer graphics module or have some experience in 3D graphics programming.
  5. Topics in Visualization: I am interested in visualization of spatialized 2D and 3D data structures (i.e. data that has a geometric structure). Some possible topics:
    • Multi-modal spatialized data visualization e.g. fused visualization of data from different sources
    • Multi-variate visualization e.g. visualizing a vector field with many variables.
    • Time-varying spatialized data visualization
    • Spatio-temporal visualization
    • Visualization using Virtual and Augmented Reality devices
    • Perception in Visualization
  6. Image editing tool for very large multi-layered images. [This is an implementation project suitable for a 4th Year Final Year Project only. There may be limited technical novelty for an MSc/5th Year project.] The objective of this project is to develop a tool for loading, exploring and editing very large multi-layered images such as those obtained from high resolution Scanning Electron Microscopes (SEMs). The software should have the basic functions typical of photo editing applications such as Photoshop, including: loading and saving of images, translation, rotation, selection, enhancement etc. In addition, it should support adaptive exploration, such as zooming or panning a subset of the image, and provide functions for dealing with multi-layered images, e.g. navigating and compositing layers. The main challenge will be that some images are too big to even hold in memory, so the image may need to be broken down into subparts that are loaded on demand but appear to the user as one large image (see the sketch after this list). The user should be provided with some abstracted/global view of the image as a whole to aid navigation.
  7. Multi-user augmented reality. This project investigates multi-user shared experiences in augmented reality. The first component of the project is to implement an interactive 3D AR demo (ideally a small game) that can be experienced or played by at least two users. The demo/game mechanics should be developed iteratively in consultation with the supervisor but must include some element of 3D spatial positioning and real-time interaction. The second component will be to analyze and optimize the efficacy of the shared experience, e.g. do all players have the same experience of the demo, are their interactions equally effective, is their view of the position and state of objects the same?
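For item 6 above, here is a minimal sketch of the on-demand tiling idea, using a NumPy memmap as a stand-in for a real tiled image format; the class, file name and dimensions are hypothetical, and a production tool would add a tile cache and a pyramid of zoom levels on top.

    # Sketch: expose a huge on-disk image as lazily-loaded tiles.
    import numpy as np

    TILE = 1024

    class TiledImage:
        """Wrap an on-disk image so only requested tiles are paged in."""
        def __init__(self, path, shape, dtype=np.uint8):
            # np.memmap maps the file into virtual memory; nothing is
            # read until a slice is actually accessed.
            self.data = np.memmap(path, mode="r", dtype=dtype, shape=shape)

        def tile(self, ty, tx):
            # Copy just one TILE x TILE window out of the mapped file.
            return np.asarray(self.data[ty * TILE:(ty + 1) * TILE,
                                        tx * TILE:(tx + 1) * TILE])

    # Usage (assumes a raw uint8 file of these dimensions exists):
    # img = TiledImage("scan.raw", shape=(200_000, 200_000))
    # patch = img.tile(3, 7)   # a 1024 x 1024 region for display/editing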

Dr. Ivana Dusparic


I am open to supervising projects developing novel artificial intelligence techniques and/or applying these techniques in intelligent urban mobility and smart cities in general.

In particular, I am interested in learning-based agents and multi-agent systems, with a particular focus on reinforcement learning, including deep reinforcement learning, transfer learning, lifelong learning, credit assignment, multi-agent collaboration and self-aware systems. I am also interested in applying these techniques to emerging urban transport models and their impact on cities, e.g. intelligent urban traffic control, car sharing, ride sharing, mobility as a service, multi-modal travel planning, learning-based personalization of travel, the impact of autonomous vehicles on traffic patterns, etc.

I am also open to proposals applying learning-based multi-agent techniques to management and optimization of other large-scale infrastructures - if you have an interest in learning-based optimization and have an application area in mind, let me know!

Some specific projects are below, but also feel free to propose your own.

Learning:

Lifelong reinforcement learning
Lifelong Machine Learning is a technique that enables an agent to learn continuously, accumulate the knowledge learned in previous tasks, and use it to help future learning. However, most machine learning techniques currently focus on learning a single task. This project will investigate lifelong learning techniques that enable accumulation and reuse of knowledge from previously encountered individual tasks in an autonomous driving scenario - agents need to detect similarity between situations/contexts encountered and draw on previous knowledge to bootstrap learning for new road conditions.

Real-time adaptation of reinforcement learning
Even though machine learning is successfully and extensively used in a wide variety of applications, there is still a large disconnect between the highly dynamic nature of most real-life environments (e.g. cities) and the static nature of most of the parameters used in learning. This project will investigate techniques for self-configuring and dynamically adapting learning parameters, e.g. action sets, rewards or state space representations in reinforcement learning.

Transfer learning
Transfer learning is a machine learning technique that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem on the same agent, or, in multi-agent systems, transferring the knowledge to another agent. Most transfer learning techniques focus on learning tasks with a single goal; however, knowledge learnt for a single goal can often be influenced by other goals that an agent has. This project will investigate transfer of knowledge between agents with different but overlapping goals.
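A minimal sketch of one flavour of transfer in tabular Q-learning: an agent trained on a source goal donates its Q-table to warm-start an agent learning a related goal. The corridor environment, goals and parameters below are purely illustrative.

    # Warm-start transfer sketch in tabular Q-learning (toy corridor).
    import numpy as np

    def train(q, episodes, goal, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
        rng = np.random.default_rng(seed)
        for _ in range(episodes):
            s, steps = 0, 0
            while s != goal and steps < 500:
                # epsilon-greedy over two actions: 0 = left, 1 = right
                a = int(rng.integers(2)) if rng.random() < eps else int(q[s].argmax())
                s2 = min(9, max(0, s + (1 if a == 1 else -1)))
                r = 1.0 if s2 == goal else -0.01
                q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])
                s, steps = s2, steps + 1
        return q

    donor = train(np.zeros((10, 2)), episodes=200, goal=9)  # source task
    cold = train(np.zeros((10, 2)), episodes=20, goal=7)    # related task, from scratch
    warm = train(donor.copy(), episodes=20, goal=7)         # warm-started from donor
    print("start-state value, cold vs warm:", cold[0].max(), warm[0].max())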

Sustainable Mobility:

Impact of autonomous car and ride sharing on urban traffic
Shared mobility is one of the key pillars of mobility-on-demand (MoD) systems. It is estimated that in 2015 nearly 8 million people used car-sharing services globally (e.g. GoCar in Ireland), with that number predicted to increase to 36 million by 2025. Similarly, user penetration of ride-sharing (e.g. Uber Pool, Lyft) was 9.8% in 2018 and is expected to hit 13.3% in 2022. As these models gain increased traction, their impact on traffic patterns within cities as well as parking demand will change. This project will investigate potential impacts of on-demand mobility on urban traffic congestion, traffic patterns and parking demand.

Optimizing ride-sharing requests
Car-hailing companies such as Uber and Lyft have recently launched new ride-sharing services which are a competitive alternative to traditional public transportation systems. Several riders can now share a vehicle for a lower fare, at the cost of small detours along their trips. From the system point of view, grouping riders with different origins and destinations in a shared car is a challenge, and is critical to maintaining profitability and encouraging customers to share rides. This project aims at developing a method to optimize rider grouping within a given time-window (e.g. every 5 min) in a ride-sharing system, focusing on a vehicle-centred objective like maximizing profit or minimizing travel distance/time. The method will be validated using trips generated from the NYC taxi dataset.
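As a deliberately simple baseline for the grouping step, the sketch below greedily pairs requests collected in one time window when their origins and destinations are mutually close; a real system would solve a proper matching/optimisation problem, and the threshold here is invented.

    # Greedy pairing of ride requests in one time window (illustrative).
    import itertools, math

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def greedy_pairs(requests, max_detour=2.0):
        """requests: list of (origin, destination) points from one window."""
        unmatched, pairs = set(range(len(requests))), []
        for i, j in itertools.combinations(range(len(requests)), 2):
            if i in unmatched and j in unmatched:
                (o1, d1), (o2, d2) = requests[i], requests[j]
                # Pair riders whose pickups and drop-offs are both nearby.
                if dist(o1, o2) + dist(d1, d2) <= max_detour:
                    pairs.append((i, j))
                    unmatched -= {i, j}
        return pairs, unmatched

    reqs = [((0, 0), (5, 5)), ((0.5, 0), (5, 4.5)), ((9, 9), (1, 1))]
    print(greedy_pairs(reqs))   # riders 0 and 1 share; rider 2 travels alone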

Prediction of autonomous taxi and ride-sharing requests [TAKEN]
As the use of car and ride sharing services increases, companies aim to increase the quality of service by minimizing waiting times and increasing the availability of vehicles. Predicting customer demand is crucial to providing this service, in order to position vehicles in the areas where demand is most likely to arise. This project will focus on developing prediction techniques to estimate customer demand for taxi and ride-sharing services and perform validation on a real-world data set using NYC taxi dispatch data.

Stephen Barrett

Social Software Engineering

My research is focussed on the identification of the unique contribution and impact of the software engineering practice of individuals and the teams they work in. My approach is to treat software engineering as a sociological phenomenon, with source code as the primary artefact of a constructive social network process. My aim is to develop toolsets that capture expert domain knowledge regarding software engineering practice and management in a form that enables us, in the language of Daniel and Richard Susskind, to externalise its qualitative assessment.

The technical basis of the approach is the application of defeasible/non-monotonic argumentation schemes, borrowed from proponents of the strong AI model such as John L. Pollock, but applied to the assessment of human behaviour rather than the replication of human decision making. We apply this method to infer judgements regarding software engineering practice, this analysis being grounded in data derived from code quality measurement, software development process monitoring, and a social analysis of software construction.

This research work is being conducted in the context of a Haskell-based software platform that gathers and processes 'ultra-large' scale data sets regarding active open-source and closed-source software development. Project students will thus need to be willing at least to take on Haskell as a programming language. Prior experience is not necessary but you should consider yourself to be a strong programmer to work with me.

Some example topics from which suitable projects can be developed include:

  • Automation of Software Development Methodology Adherence Testing: the use of fine grained behavioural measurement regarding software engineering to quantify adherence to development methodology. In this topic, we are interested in delivering practical tools and methods by which software teams can encourage and monitor process development goals.
  • Privacy Preserving Gamification of Software Engineering Processes: the use of gamification in the assessment and management of software engineering processes. In this topic, we are interested in exploring how gamification can positively impact on the performance of teams and individuals.
  • Situated Learning Framework for Software Engineering Community of Practice: the development of a model for the automated identification and recording of engineering activity for practice learning. In this topic, we are interested in developing ways in which the best practice and skill of senior and experienced team members can be automatically packaged as learning resources for the organisation.
  • Sociometrics in Software Engineering: the use of sociometric and biometric data to predict individual and team performance in software engineering. In this topic, we are interested in studying the environment and social network structure of software engineering teams in order to provide actionable measures of team performance and health.
  • A Platform for Social Software Engineering Research: the development of a scalable platform for social software engineering analysis. In this topic, we are interested in developing high level domain specific languages to enable sophisticated bespoke analysis by non-technologists of social network and software quality data pertaining to the software engineering effort.
  • High Scale Code Quality Measurement: a data evolution based cloud platform for the efficient computation and continuous re-computation of code quality metrics. In this topic, we are interested in exploring how predictive relationships might exist between various possible ways of measuring software engineering, such that more efficient and rapid result computation can be achieved.

Please note that I am unfortunately unable to take on projects outside this broad research space.

If these topics interest you, do send me an email, briefly summarising your interest, and software development experience.

thanks,

Stephen.

Fergal Shevlin, Ph.D.

Room: Lloyd UB.73
Email: fshevlin@tcd.ie

Note!

    I feel that project ideas conceived by students are usually the most interesting. If you have any ideas related to the following then let's talk about them to see whether we can specify an appropriate project tailored to your own unique interests and abilities.

    Android Vision

    Projects in the general area of "Computer Vision" (viz. image processing and analysis,) implemented on the Android platform for mobile devices. Thus the programming language(s) required would be at least Java with possibly some C/C++.

    Mathematical Methods

    Projects in the general area of "Mathematical Methods for Computer Graphics, Computer Vision, Robotics, Physical Simulation, and Control" implemented using appropriate method libraries. The most appropriate programming languages are likely to be C/C++ or Python.

Dr Hugh Gibbons

Room

Extension

LG20

1781

Support for Literate Programming in Java

Literate Programming is defined as the combination of documentation and source put together in a fashion suited for reading by human beings. It is an approach to programming which emphasises that programs should be written to be read by people as well as compilers.
There are many tools available to support Literatre programming but they are mostly available on Unix systems and for programming languages such as Pascal and C. While Javadoc is available to document Java programs, the aim of the project is investigate the benefit of using Literate Programming in Java.

Using CASL to Model a software system

CASL (Common Aalgebraic Specification Language) offers the possibility of formal specification and development while allowing a more informal working style. The project would investigate using CASL to develop a formal model of some software problem which may or may not have already been previously presented in formalisms such as VDM or  Z. This model could then be informally translated into a programming language such as Java.

Developing Programs using Perfect Developer or How to develop a Java program without writing Java.

Perfect Developer is a program development system which allows one to develop provably correct programs. First one develops the program in the notation of Perfect Developer and then the system can verify the program written. Once one has a correct program, Perfect Developer can automatically translate the notation to Java, C++ or Ada.
(See Perfect Developer: How it works)

Imperative Programming within a Functional Programming Setting

While Functional Programming (FP) supports Lists better than Arrays, it is possible to write FP programs that are based on arrays. Since FP programs are side-effect free, it is usually easier to prove FP programs correct than Imperative programs. The aim of the project is to develop Java type programs within an FP setting.

Simulating Puzzles or Games in Functional Programming

Over the last many years there have been successful projects in using Functional Programming to provide animations or simulations of puzzle type programs. Since Functional Programming languages such as Haskell are very high level language, expressing solutions to puzzle type problems may prove easier than in imperative languages or declarative languages such a Prolog or Lisp/Scheme. Possible puzzle problems would be cryptarithms where one has to fill in the missing digits in a arithmetic calculation, logic puzzles or puzzles involving Graph Theory. Puzzles and games from the works of Martin Gardner would be an interesting starting point.

Support Systems for Teaching Logic and Logic Proofs

Systems such as Tarski's World and Hyperproof have proved very valuable in teaching an understanding of both propositional and predicate logic. These systems are part of a more general Logic project Openproof at Stanford's Center for the Study of Language and Information (CSLI). An alternative logic proof system is provided by Jape, a system developed by Richard Bornat and Bernard Sufrin which supports Bornat's book Proof and Disproof in Formal Logic . A more modern Logic Proof System, KE, has been developed by Marco Mondadori and Marcello D'Agostino with associated computer program systems WinKE by Ulle Endriss and LogicPalet by Jan Denef. It would be useful to provide support tools for these system so that these systems could be more widely used. An example of a logic support system is given by the Logicola system by Harry Gensler to support his logic book Introduction to Logic.

Ruler and Compass Construction within Vivio

Vivio is an animation tool that allows one to animate algorithms and simulations. The project would involve investigating the use of this tool for creating classical Euclidean constructions, for example, the construction of a pentagon using a compass and ruler.

Program Transformation of Z into JML

The development of the Java Modelling Language (JML) was influenced by specification languages such as Z. Many software projects make use of tranforming specifications into imperative programs. An example of this approach can be seen, in particular, in the book "Introduction to Z", by Wordsworth. The examples in Wordsworth's book could be used as a starting point in transforming Z specifications to programs into JML.

Annotated Java with JML

The Java Modeling Language (JML) is a behavioral interface specification language that can be used to specify the behavior of Java modules. It is based on the approach of Design By Contract  (DBC) The draft paper Design by Contract with JML (by Gary T. Leavens and Yoonsik Cheon) explains the basic use of JML as a design by contract (DBC) language for Java. See also Joe Kiniry (University of Copenhagen)  presentation, Introduction to JML. A given project would investigate the use of JML providing examples of its use.  For example, how would a program for Binary Searching an array be implemented in JML.

 

Developing High Integrity Code in Spark

Spark is a high level programming language designed for developing software for high integrity applications. Spark encourages the development of programs in an orderly manners with the aim that the program should be correct by virtue of the techniques used in construction. This 'correctness by construction' approach is in marked contrast to other approaches which aim to generate as much code as quickly as possible in order to have something to demonstrate. Quoting from the book on Spark  "High Integrity Software, the Spark approach to Safety and Security " by John Barnes
" There is strong evidence from a number of years of use of Spark in application areas such as avionics and raliway signalling that indeed, not only is the program more likely to be correct, but the overall cost of development is actually less in total after all the testing and integration phases are taken into account"
SPARK will be familiar to programmers with knowledge of imperative languages such as C, Java and Ada. There is some effort involved with learning how to use the annotations correctly. 
A project using Spark would involve the development of reliable programs that can be proved correct by the Spark system.

Dr Gerard Lacey

Room

Extension

TTEC

087 2396567

Project

I am currently a part time academic as I am currently working in a Trinity spinout www.surewash.com My main research areas are computer vision, robotics and augmented reality. My research focus is the development and empirical evaluation of mixed media solutions using real world problems.

Problem / Background

Augmented reality(AR) is the overlay of interactive graphics onto live video such that it reacts to the content of the video image e.g. selfie filters that track face movement. Mobile phones are becoming one of the main platforms for AR. This project focuses on the tracking of hands in mobile phone images for gesture recognition, content overlay and gaming. General purpose hand pose tracking is a complex problem but custom hardware solutions: www.leapmotion.com , www.usens.com and complex software libraries: www.manomotion.com are available.

One of the biggest challenges is achieving high speed and reliable segmentation of the hands against real-world backgrounds and under variable lighting conditions. The next main challenge is the identification of the fingers and matching them to a hand pose model. If the hand gesture problem can be constrained this may be simplified and good performance achieved.

Solution / Goal of the Project:

This project will aim to develop a mixed reality application that will allow someone to “try on a glove” using their mobile phone. The goals of this project are to:

  • Reliably segment the hands on a mobile phone
  • Recognise and track the orientation of the hands
  • Render a 3D glove model over the live video image aligned to the hands
  • Develop a solution for finding an accurate measure of hand size
  • Develop a prototype application on iOS or Android
  • Perform user testing and formal evaluation of performance

The application will be developed using www.unity3d.com and www.emgu.com .

Glenn Strong

Room

Extension

ORI G.15

3629

Many of these projects involve some knowledge of functional programming. No prior knowledge is needed before starting the projects, we provide support for learning this new programming paradigm for students who have not been previously exposed to it. Of course, if you already know a language like Haskell you'll be able to start the project a little quicker.

Functional programming for Creative Arts

There are a number of Haskell embedded DSL's for creative coding of music and animation, for example TidalCycles, LSystems, a port of the processing language, and so on. There is a lot of scope to improve these languages and make them available to a wider range of users. A project in this area would require some knowledge of Haskell and an interest in the creative coding space.

Drag and drop Python

We have developed a prototype implementation of a structured programming editor for the Python programming language. There are several possible follow-on projects using a existing frameworks (such as Google's Blockly, Microsoft's MakeCode, etc). While there are some tools that can generate Python from Blockly programs, the source that the users work with don't tend to look much like Python programs. The goal with this project is to provide a Python-oriented system (perhaps in the style of the old Carnegie Mellon structure editors, or the Raise Toolkit)

Depending on the student's interests there are either pure development projects (adding more features to the existing prototype), or research projects investigating some open questions about how to design these environments and how effective they are.

Other projects

I am happy to discuss project ideas in the Functional Programming, Literate Programming, Theoretical Computer Science, or other similar areas. If you have a specific project in mind then send me an email. I am also willing to discuss software implementation projects with a bias towards rigour (using formal techniques, or design-by-contract ideas). I am also interested in creative ways to support novice programmers and in the study of Free Software related projects,

Mads Haahr

Room

Extension

Stack B, 2013

1540

AI for card-based game

Gambrinous is an indie game studio based in Dublin, best known for Guild of Dungeoneering. They are now working on their second game and have an opportunity for someone to improve the AI for the project.

The game is a 1-on-1 card game in the style of Magic: The Gathering, Ascension, or Hearthstone. The player plays against an AI opponent taking turns playing cards until one wins. We have a prototype built in Unity with a very simplistic AI.

Gambrinous are looking for the following improvements:

  • an improved tactical AI that plays smarter with the cards it has in a given turn
  • a strategic AI that works towards a plan or strategy over the course of many turns in a single match
  • the game involves blind-bidding for resources using the cards from your hand each turn. We'd like to see an AI that did things a human opponent would do (like bluffing or delaying tactics to see what the opponent is going for)
  • varied levels of AI strength (eg for easy/hard difficulty) and AI playstyles (eg for aggressive / rushing / slow build styles of play)
  • as part of this work it would also be interesting to pit AIs against each other in automated matches (also a great way of testing AI tweaks).

Lucy Hederman

Room

Extension

ORI G.13

2245

2018-2019

My proposed projects/areas should suit MSc CS students on the Intelligent Systems and Data Science strands and Final Year Projects for ICS and CSB students. To discuss, please email me at hederman@tcd.ie

Broadly I am interested in "data wrangling" for health IT and clinical research purposes.

The following projects relate to the AVERT project which is concerned with predicting relapses (or flares) of ANCA vasculitis, a relapsing and remitting rare autoimmune disease that results in rapidly progressive kidney impairment and destruction of other organs. Epidemiological data seem to show a strong environmental impact on relapse in ANCA vasculitis, though it is unclear which exactly environmental factors are responsible for this. The rapidly emerging discipline of data science - alongside massive increases in computing capability, machine learning and artificial intelligence - is poised to allow the incorporation of such highly complex health big data environments, and the generation of outputs with potential applicability in personalised medicine. We aim to integrate a wide array of unstructured data streams to define the signature of relapse of the disease. We believe this approach will represent a new paradigm in managing chronic conditions governed by interaction between patient-level factors and their environment, and could be scaled up if successful for use for other autoimmune diseases.

Data integration for AVERT is using linked data principles. Different streams of data are combined in an RDF triple store.

Enriching patient data with weather, pollution and infection data based on their location

Once we know where an AVERT patient has been, we need to attach environmental data to their record, for subsequent analysis. This project will connect to various sources of weather, pollution and infection data, query for the relevant items, and add the results to the linked data (RDF) graph. Challenges will inlcude variety of data sources (web APIs, locally available models, csv files, etc.), transforming data at certain locations to a value for the location of the patient, and attaching appropriate metadata to the environmental data to record how it was arrived at.

AVERT staff user interface to patient app data - TAKEN

This project is probably suitable for a FYP. AVERT participant patients use an app, patientMpower, to record health data and location data. The research nurse needs an intuitive interface to review the data in the app developers database, about who is enrolled, what phone types they are using, who is recording data actively, etc. The project will involve SQL, web forms, and engagement with the end users.

AVERT application prototype - TAKEN

The end goal of AVERT is to have a "realtime" decision support system to predict flares from realtime patient and environment data and advise clinicians on treatments. This project will develop a prototype of such a system, in anticipation of a future flare prediction model. It will combine data from mobile app, clinical records, and location based weather and pollution data (where available), feed it to a "black-box" flare model, and present the output in some form that would be useful to clinicians, and possibly patients. This project is principally a software enginering project, building a system to pass data between components and providing controls on those data flows.

Making clinical research project data shareable outside the project

The AVERT project hopes to make its data available, in structured, semantically interoperable, de-identified, form, to other researchers, as part of an "information commons". This project will need to explore a broad range of technical and non-technical issues in devising a safe, useful and usable solution. How do bio-scientists work with data? How does data protection impact science? How do we ensure shared meaning of data? How do we protect patient identity? etc.

Driving AVERT app user engagement

(Suitable for an MSc CS (Information Systems) student).

This project seeks to use state of the art (machine learning) techniques to source and serve content of relevance for the vasculitis patient group, with a view to increasing patient engagement with the app.

Non-AVERT project :

Develop an Irish Traveller genetic disease diagnosistic web resource

Clinical experts in Irish Traveller disease want to build a web resource to aid timely diagnosis of rare genteic disease that affect Irish Travellers. The project involves designing a building database of the relevant diseases so that users of the web resource can enter signs and symptoms and get a shortlist of potential diseases and advice on pursuing the diagnosis further. There is something similar for Anabaptist disorders (see the Search Database options). There is scope to extend the project in various ways to an MSc dissertation project.

Updated 19 September 2018

Hitesh Tewari

Room

Extension

Lloyd 131

2896

Last Updated 7th September 2018


I have a number of projects ideas in the area of network security and cryptography. In particular I am active in the area of Distributed Ledger Technologies. My projects require prespective candidates to have a strong mathematical background, along with good programming and analytical skills.

Distributed Consensus using Directed Acyclic Graphs (DAGs)

Implementing Smart Contracts with a subset of the C Programming Language

Interoperability of Blockchains

Blockchain Network Slice Broker in 5G



Meriel Huggard

Room

Extension

Lloyd 1.09 (inside lab 1.11)

3690

I'm happy to supervise projects at both senior undergraduate and MSc level. In recent years I've predominately supervised MSc projects.

I'm offering a variety of project in 2017/18, some of which include the opportunity to collaborate with colleagues in the USA.

Project I and II: Using Machine learning to monitor and predict Quality of Experience AVAILABLE

    The separate areas of machine learning and wireless/cellular connection quality management (through both Quality of Service(QoS) and Quality of Experience(QoE)) have rarely been combined. Recent research has led to algorithms which focus on providing solutions which employ machine learning models to enable distributed QoE monitoring and prediction. This project seeks to evaluate the performance of these novel algorithms and to compare their performance with existing QoE estimation techinques. .

Project III: Quality of Experience based Admission Control in Cellular Networks AVAILABLE

    5G networks are expected to be able to accommodate a very large number of wireless devices and users. One way of handling the high volume of traffic expected on these networks is through the use of small cells and by handover to wifi and other networks. This project will use simulation methods to evaluate the performance of admission control and handover algorithms for these systems for different traffic mixes and loads.

Project IV: Control Systems AVAILABLE

    Dynamic Watermarking in Cyper-physical systems
As we deploy large scale networked cyber-physical systems, they become more vunerable to attack. For example, sensor data may be tampered with, causing actuators to become malicious agents. One approach that can be adopted to mitigate this is to watermarking the data transmitted by the sensors, so that the system can detect when signals have been tampered with. This project will explore and evaluate watermarking techniques for such large scale distributed systems.

Project V: Your project ideas... AVAILABLE

    If you have interesting ideas in the networking/communications/control domains pelase feel free to get in touch. You will need to be able to clearly articulate (i) what you are proposing to do, (ii) what the underlying research question/hypothesis you intend to explore is, (iii) why the outcomes of your work will be of interst to others and (iv) why you want to do this project.
For futher information or to arrange a meeting: :wq:wqMeriel Huggard

Georgios Iosifidis

SCSS and CONNECT Centre

Lloyd Institute

 

 

Project 1

 

Title: Resource Orchestration in Hybrid Cloud-Fog Networks

 

Background: The emerging fifth-generation (5G) wireless networks are expected to offer various mobile services such as ultra high definition video delivery, augmented reality  services, and machine learning-based applications. Due to the limited capabilities of the users devices, these demanding mobile services can only be delivered with the support of cloud computing and storage resources. The latter can be located in distant data-centers, or in proximity with the end-users e.g., in Cloudlets or even nearby mobile devices. According to Forbes and Economist [1, 2], Cloud and Fog computing solutions are attracting increasing interest as a promising and cost-efficient solution for next generation communication networks. However, these hybrid architectures induce substantial network bandwidth costs as well as very high energy consumption in data centers, especially under high-load conditions. This is currently one of the largest obstacles hampering the large-scale adoption of these promising solutions.

 

Goals: In this project, we will design algorithms for jointly optimising the allocation of computation, storage and communication resources that are located at the Fog (in proximity with the devices) or the Cloud, aiming to increase the quality of the offered services and reduce the systems expenditures. We will leverage android programming tools (e.g., https://developer.android.com/studio/index.html based on Java) to make mobile  applications and MATLAB programming to execute trace-driven large-scale simulations and data processing. In the final step, this project will analyse the quality of service of various cloud-based applications and efficiency of the resource management by performing experiments using mobile devices, local computing/storage servers and cloud platforms, e.g., Microsoft Azure [3]. 


Student Info: This project is particularly suitable for M.Sc and MAI students interested in Cloud/Fog architectures and (i) system modeling and analytical methods (i.e., optimisation) and/or (ii) system design and performance evaluation. The student will collaborate with Dr. J. Kwak (https://sites.google.com/site/jeonghokwak/home) and Dr. G. Iosifidis (www.FutureNetworksTrinity.net); and will have the opportunity to participate in ongoing (fast-paced) research projects and acquire important analytical and technical skills.

 

References

[1] Economist, Shifting computer power to the cloud brings many benefits—but dont ignore the risks, URL: https://www.economist.com/news/leaders/21674714-shifting-computer-power-cloud-brings-many-benefitsbut-dont-ignore-risks-skys-limit

[2] Forbes, Is Fog computing the next big thing in Internet of Things? URL: https://www.forbes.com/sites/janakirammsv/2016/04/18/is-fog-computing-the-next-big-thing-in-internet-of-things/#64279faf608d

[3] Microsoft Azure, URL: https://azure.microsoft.com/en-gb/?&wt.mc_id=AID623280_SEM_

 

 

Project 2

 

Title: Economics of the Internet of Things

 

Background: The promise of the Internet-of-Things (IoT) is to enhance our physical world with connected and intelligent devices that can respond in real-time to environmental conditions, perform tasks with increased precision, augment human capabilities by operating in a semi-autonomous fashion, and improve resource utilization. Applications of IoT can be found in manufacturing, traffic control, energy grid, electric vehicles, environment monitoring and many other domains [1], [2]. IoT is expected to have a profound impact on our economy and society and is currently subject to intensive research in industry and academia. A particularly promising feature of IoT devices is their capability to interact with each other so as to jointly perform a task, or coordinate the execution of their missions. For example, consider a set of sensors that jointly monitor certain environmental parameters (e.g., air pollution) in a given area. The sensors might belong to the same or different business entities (e.g., different companies) and might have overlapping coverage. This enables them to cooperate and exchange measurements or support each other in case of failure, improving this way their performance and reducing their costs.

 

Goals: In this exciting era of ubiquitous connectivity that extends from humans and large systems to small-scale devices, a new type of cyber-physical economy emerges offering novel opportunities for fruitful collaboration among the users and their devices. In this project we will design and evaluate algorithms that enable IoT devices to cooperate by exchanging resources (such as energy and wireless bandwidth) and jointly improve their performance. We will combine tools from dynamic optimization and game theory to develop solutions that achieve efficient equilibriums [4]. Different market scenarios will be considered, ranging from fully decentralized (peer-to-peer) to hierarchical markets where more resourceful users/devices sell their resources or services to smaller IoT nodes. The designed algorithms will be thoroughly evaluated in Matlab and/or R.

 

Student Info:  This project is suitable for students interested in algorithms, optimisation, market mechanisms (game theory) and IoT business models. The student will collaborate with Dr. G. Iosifidis and CONNECT [3].

 

References

[1] L. Atzori, A. Lera, and G. Morabito, The Internet of Things: A Survey, Elsevier Computer Networks, vol. 54, no. 15, 2010.

[2] Cisco, Fog Computing and the Internet of Things: Extend the Cloud to Where the Things Are, White Paper, 2015; URL: https://www.cisco.com/c/dam/en_us/solutions/trends/iot/docs/computing-overview.pdf 

[3] CONNECT Centre, Pervasive Nation IoT Platform; URL: https://connectcentre.ie/pervasive-nation/

[4] G. Iosifidis, and L. Tassiulas, Dynamic Policies for Cooperative Networked Systems, in Proc. of ACM NetEcon Workshop, 2017, Boston, USA.

 

Dr Jonathan Dukes

Email

jdukes@tcd.ie

Room

Room F.27, O'Reilly Institute

Firmware Updates for LoRaWAN End Nodes

[AVAILABLE]

LoRaWAN is a Low Power Wide Area Networking (LPWAN) technology for occasional, low data rate communication from (and occasionally to) resource-constrained devices in the Internet of Things. This project will investigate the feasibility of performing remote device firmware updates (DFU) on resurce-constrained LoRaWAN end nodes.

Prior work has investigated the feasibility of transferring firmware images over LoRaWAN. This project will build on the prior work by investigating (i) the installation of new firmware images on the end nodes, (ii) the design and implementation of a suitable control protocol for DFU over LoRAWAN and (iii) the implementation of a prototype cloud-hosted service for managing DFU.

Firmware Updates for the Zephyr Embedded Operating System

[AVAILABLE]

The Zephyr Project is an open source operating system for resourced-constrained devices in the Internet of Things. This project will design, implement and evaluate a device firmware update (DFU) mechanism for Zephyr.

This project may be undertaken for any one of a number of wireless communication technologies (e.g. Bluetooth Low Energy (BLE) or RFC7668 (IPv6 over BLE)).

RFC7668 Gateway

[AVAILABLE]

RFC7668 is an IETF standard for the transmission of IPv6 packets over Bluetooth Low Energy. This project will design, implement and evaluate a gateway router for RFC7668. Some of the features that may be considered for implementation include: (i) centralised (cloud) management of multiple gateways, (ii) load balancing across multiple gateways, (iii) support for mobility.

Teaching Tools for ARM Assembly Language Programming

[AVAILABLE]

This project will implement tools to support teaching ARM Assembly Language Programming. Of particular interest is the implementation of automated tools for testing the functionality of ARM Assembly Language programs, as well as providing mechanisms for instructor feedback through annotations.

One approach that may be adopted by the project is the development of a "back-end" for an existing tool for evaluating and grading coding assignments.

Other Projects

Please contact me if you have an idea for your own project in a similar area to the projects above, or generally in the area of embedded systems, low-power wireless network protocols, "Internet of Things" applications, or Bluetooth Smart (aka Bluetooth Low Energy). I would also be interested in supervising projects involving media streaming.

Last update: Thursday 13 September 2018

Dr Jeremy Jones (updated 24-Jul-18)

Top floor South Leinster St. room 4.16

1.      DNA Analysis: BWBBLE is a typical bioinformatics program used to analyse DNA. It is written in C and is used to locate the positions of many millions of short DNA read sequences in a reference genome. The reference genome contains 3x109 (billion) base pairs and the short DNA sequences 30 - 300 bases. Each base is represented by a character (e.g. A, C, G and T). The analysis is naturally parallel. BWBBLE has been ported to the Google Cloud Platform using SparkBWA as part of a 2017-18 final year project. With hindsight, this approach was overly complex and did not give that kind of speedup expected or even possible. The objective of this project is to speed up the analysis by running BWBBLE in parallel on the Amazon Cloud using a more straightforward approach.

2.      ThreadPool implementation: a program may proceed in phases where multiple threads are created for each phase. A threadpool is used to reduce the cost of continually creating (and terminating) threads for each phase. A pool of threads is created initially which are then reused in each phase leading to greater efficiencies. The object of this project is to (i) determine the cost of creating (and terminating) threads (ii) implement a portable threadpool  with a simple API for Windows and Linux and (iii) determine if greater cache efficiencies can be obtained by directing similar work from different phases to particular CPUs.

3.      Radix Sorts:  Sorting plays a very important role in string searching and DNA analysis (e.g. the Burrows Wheeler transform). The objective of this project is to analyse the performance of a number of parallel sorts to determine when, for example,  a radix sort will be faster than a quicksort  given a large array of integers (hundreds of millions of integers) .

4.      ARM CPU animation: Most of you will have used the MIPS/DLX animation as part of the CS3021/3421 Computer Architecture II module. The objective of this project is convert the DLX/MIPS animation into ARM CPU animation for use by students. This will consist of two parts (i) implementing the instruction set and (ii) trying to match the pipeline to a real ARM CPU.

These projects could form the basis of a final year project, year 5 dissertation or taught MSc dissertation.


 

main.2018-2019.shtml

Prof. Khurshid Ahmad

    My principal area of interest is in artificial intelligence, including expert systems, natural language processing, machine learning and neural networks. My other area of interest is in the ethical issues of privacy and dignity in the realm of social computing. I have recently finished a largish EU project on the impact of social media (monitoring) in exceptional circumstance (slandail.eu) C this brings together social media analytics (natural language processing/image analysis), geolocation, and ethics. My research has been applied to price prediction in financial markets and in forecasting election results. The projects on offer in this academic year are:

  • 1. Social Media Analytics and monitoring:

    Microbloggs and social networks are large and noisy source of data that is valuable for marketing and sales specialists, law enforcement agencies, disaster NGOs, and policy makers. This project will help you in acquiring social media data, in using natural language processing techniques for processing this data, and techniques for visualise the results of the analysis. You will be expected to include a brief discussion of questions of privacy and ownership of social media users. A proficiency in Java and/or Python is required for this project.

  • 2. Sentiment Analysis:

    This is an exciting branch of computer science that attempts to discover sentiment in written text, in speech fragments and visual image excerpts. The sentiment is extracted from streams of texts (messages on social media systems, digital news sources) and quantified for inclusion into econometric analysis or political opinion analysis systems that deal with quantitative data like prices or preferences: the aim is to predict price changes or ups and downs of a political entity. You will write a brief note on questions of divulging identities of people and places. You have an option of developing your own sentiment analysis system or use a system developed in my research group.

  • 3. Machine Learning and Big Data:

    Large data sets, for example genomic data, high-frequency trading data, meteorological data, and image data sets, pose significant challenges for curating these data sets for subsequent analysis and visualisation. Automatic categorisation systems, systems that have learnt to categorise arbitrary data sets, are in ascendance. One fast way of building such systems is to integrate components in large machine learning depositories like Googles TensorFlow, MATLAB, Intel Data Analytics, to build prototype systems for text, videos, or speech streams for instance. Issues of data ownership will be briefly outlined in your final year project report.

  • 4. Cryptocurrencies and Market Risks

    Cryptocurrencies are designed for securing the integrity and trustworthiness of complex transactions in a distributed environment at high speeds. However, there have been cryptocurrencies that have crashed and some have more volatility, in a technical sense, than, say, is the case for conventional transactions including 'real' currencies, stocks, bonds and commodities. The external environment appears to influence the risk profile of cryptocurrencies, including consumer behaviour and attitudes.  Equally importantly, the variation in the cryptocurrencies can be correlated assets transacted conventionally.  In this project you will specify, design and build a prototype for studying the behaviour of these currencies together with key variables of the external environment.  There are arrangements underway, though not fully completed, which will enable you to work with a large blockchain research initiative undertaken by a UK consultancy in this area.

          

Dr. Kenneth Dawson-Howe

I supervise projects which have a computer vision component (and most particularly in the area of object tracking at the moment!!). To give you a feel for this type of project have a look at some previous projects. For further information (about project organisation, platform, weighting for evaluation, etc.) see this page. If you want to talk about a project or suggest one in the area of computer vision my contact details are here.

The following are my proposals for 2018-19. If any of them catch your imagination (or if you have a project idea of your own in computer vision) come and talk to me or send me an email.

Illustration. Project Details.

CPR Assistant.
Cardiopulmonary resuscitation (CPR) is a technique used to keep oxygenated blood flowing in the human body when the heart and breathing of a person have stopped. It is a repeated combination of chest compressions (30 in a 15 second period) and artifical breaths (2 within a 5 second period), and has to be continued until other measures are taken to restore spontaneous operation of the heart and breathing. Following a project last year where the Chest Compression Rate (CCR) was successfully determined using a mobile phone app we want to continue this work potentially through two additional projects:

  1. (TAKEN) We want to try to evaluate the Chest Compression depth (CCD) which is as important as the CCR. This may be attempted using the same 2D video images from the camera phone/angle from the project in 2017-18 or could be attempted using an iPhone X where 3D information would be available. If possible we also want to look at the posture of the person giving CPR as it is asserted in the literature that the arms must be kept straight to give proper compressions.
  2. (TAKEN) We want to consider the evaluation of the CCR from different camera angles (e.g. if the phone was flat on the floor looking upwards) and/or if the phone was obscured (i.e.\ could we evaluate the CCR from the audio stream).
Note that these projects will probably be done in conjunction with a First Responders organisation in South Dublin (who allowed us to take test footage last year).
Image linked from another site

Driver fatigue.
(TAKEN) Driver fatigue is estimated to cause around 20 percent of accidents. This project will develop a system either on a mobile phone or on a raspberry pi to automatically monitor a driver using a camera and provide an audible warning when they are showing signs of drowsiness. Ideally the project will develop an app which can work on existing phones.


Automatically analysing the Visual Landscape.
We want to look at developing tools to aid in the automatic evaluation of the ``Linguistic Landscape''. What this means is looking at various aspects of signage which appear in our environment and using the information as part the evaluation of society. For example there is published work which looks at the prevalence of Hebrew, Arabic, English, and other languages in regions of the Middle East. Note that these projects are available to students on all programmes who are taking (or have taken) computer vision / image processing modules. This is huge field and we want to run two projects this year:

  1. (TAKEN) Development of a system to automatically locate and extract all areas of text within complex images and to identify the language. The annotation would need to be stored in an XML format for each image. As part of this development a tool will be required to allow the manual annotation & editing of the Linguistic Landscape in image respositories (so that any mistakes can be corrected and so ground truth can be specified for the images analysed. As part of this project we might look at evaluating the linguistic landscape in part of Northern Ireland or the Republic - potentially using Google StreetView to provide the imagery.
  2. (AVAILABLE) Part of the Linguistic Landscape is identifying where an image is taken and so we want to automatically classify images as 'single sign', 'shopfront', 'street view', 'corner view', etc. This *might* be appropriate for a deep learning system but I have to warn that I'm not a fan!!
These projects will be run in conjunction with Dr. Jeffrey Kallan in the Linguistics Department

Prof Doug Leith 2018-2019 Projects

The following projects are related to recent research in online privacy and recommender systems being carried out here in TCD, see www.scss.tcd.ie/doug.leith/. Most of them could form the basis of a Year 5 dissertation or of a final year project.


Timing-Based Attacks Against OpenNym TAKEN

OpenNym is a peer-to-peer system for web cookie sharing/management. Our online "identity" is largely managed via cookies set by web sites that we visit, and these are also used to track browsing history etc. With this in mind, recently a number of approaches have been proposed for disrupting unwanted tracking by sharing and otherwise managing the cookies presented publicly to the web sites that we visit, the aim being to allow personalisation to still occur (so simply deleting cookies is not a solution) but avoiding fine-grained tracking of individual user activity. However, the timing of user interactions with a web site may provide a vector for carrying out linking attacks e.g. if a user submits ratings to a recommender system at similar times then they might be clustered by an attacker and so linked back to the user. This project will investigate approaches for disrupting such timing attacks, the two main methods being by injecting dummy interactions and by buffering interactions so as to disrupt timing and ordering patterns.


Tracking the Trackers TAKEN

User tracking and monitoring by web sites is becoming more sophisticated. Increasingly web sites use javascript to track fine-grained details of user activity such as mouse movements. New HTML5 functionality such as websockets provides new communication channels that are largely invisible to users yet can provide much of the same tracking capabilities as cookies, and this can be augmented by browser fingerprint approaches using not just the HTML5 canvas object but also the sound playback etc. This project aims to build a browser extension that monitors the activity of website javascript and makes such unusual/interesting activity visible to users in real-time. If time allows machine learning methods might be used to help classify detected activity as unusual/interesting based on user preferences and history.

References:


Google No Click

In the future it is likely that search engines will encrypt the urls contained in the displayed results so as to assist with tracking of user clicks (since to decrypt the url the users browser needs to contact the search engine and in this way the search engine can log that a click has occurred). However, a previous TCD project showed that the snippets of text displayed with google search results is sufficient to infer the search result url with high accuracy and so avoid such tracking. To create a workable solution however requires sharing and management of these snippets amongst users. This project will investigate the use of a peer-to-peer snippet database for this purpose based in the IPFS DHT. A key issue will be not only ensuring efficiency and scalability but also privacy i.e. that submitting links to the database or using the database to convert snippets to urls does not disclose user browsing history.
Reference:


Switchable Network Identities for Enhancing Privacy

This project will look at mechanisms for changing the set of identifiers (DHCP IDs, MAC addresses, IP addresses, TCP sessions etc) presented by user devices to the network and which can be used to link traffic to a specific user across time. This is current of interest to Android etc to enhance user privacy, but a key challenge is to switch identifiers without disrupting connectivity too much and so an awareness of context is likely to be important (e.g. an employers wifi hotspot may have access control based on MAC address but at home access is only restricted via a WPA password). Switching identities also creates new opportunities for personalisation while maintaining some privacy e.g. if people with similar interests share the same set of identities then personalisation is still possible while users are provided with a kind of "hiding in the crowd" privacy.


Mobile Handset Anomaly Detection TAKEN

Mobile handsets are largely black boxes to users, with little visibility or transparency available as to how apps are communicating with the internet, trackers etc. Handsets are also potentially compromised devices in view of the relatively weak security around apps, and so monitoring activity in a reliable way is important. This project aims to carry out a measurement study to record actual mobile phone network activity with a view to making this more visible to users (via a dashboard) and highlighting anomalies/potentially interesting activity. By routing traffic through a VPN we can log traffic externally to the phone in a straightforward way, so the main challenges are (i) organising a study to collect data for multiple users, (ii) developing a usable dashboard for them to inspect their own traffic and (iii) exploring machine learning methods for classifying traffic and anomaly detection. This project will include the opportunity to collaborate with Dublin-based startup Corrata.
References:


Mobile Serverless Apps for Wireless Augmented Reality TAKEN

Next generation mobile apps such as wireless augmented reality require opportunistic offloading of processing to the cloud while maintaining low latency operation (so as to keep with the camera frame rate). Amazon lambdas provide a much more lightweight abstraction than containers for implementing such offloading. This project will first investigate the creation of a framework for allowing lambdas to be executed locally on a mobile handset, so reducing latency when local computation is feasible. This will then be combined with smart decision making which opportunistically decides whether a given lambda call should be executed locally or offloaded to the cloud (based on networking conditions, battery level, latency requirements etc). The project will involve android app development and use of amazon lambdas for image processing.


Pluggable Tor Transport

The Tor browser supports use of pluggable transports to allow shaping of traffic sent over the network so as to obfuscate its nature. These transports can also be used with VPN tunnels, e.g. see Shapeshifter-dispatcher. The aim of this project is to implement a pluggable transport based on recent research here in TCD on making making traffic resistant to timing analysis attacks without the need to introduce high latency or many dummy packets. We'll take one of the existing Tor transports as a starting framework and then modify it as needed. The project will require good programming skills, but its a great chance to contribute to Tor's development and improve existing VPNs.
Reference:

Dr Alfredo Maldonado

ADAPT Centre Lab, Ground Floor O'Reilly Building

If you're interested in one of my projects, send an email to alfredo.maldonado@adaptcentre.ie with "PROJECT IDEA" as subject line.

These projects focus on neural word embeddings, their semantic properties and alternative ways of constructing and/or applying them.

Word embeddings are the input layer to modern neural approaches to practically all kinds of Natural Language Processing (NLP) tasks like text classification, named-entity recognition, machine translation, you name it. When a neural NLP system processes text, it 'translates' each input word into a word embedding vector and passes that vector to the subsequent layers of the neural network for further processing.

Word embeddings are usually trained only once on large generic raw text corpora (like Wikipedia dumps or huge amounts of crawled web pages) and are rarely created from scratch for a specific application in particular. Indeed, that's part of their beauty: you don't have to spend lots and lots of (computer) time (and electricity!) calculating word embeddings for every application you have in mind. You can just use pre-trained word embeddings created by someone else and reuse them on each application. However, would word embeddings trained on a 2009 edition of Wikipedia be optimal for extracting medication side-effects complaints from patients' social media posts today?

There are several methods for creating word embeddings from raw text, such as word2vec (CBOW and Skip-Gram), GloVe and FastText. They all produce distributional vector representations of words. This means that if two words tend to appear more or less with the same type of words, then their vector representations would be similar. This exploits the so-called distributional hypothesis, which says that if two words co-occur with the same words, then their meanings are related. For example, 'car' and 'truck' would each tend to co-occur with words like 'road', 'traffic', 'drive' and 'speed'. And indeed, both 'car' and 'truck' are semantically related: they are examples of 'motor vehicles'. It then comes as no surprise that the word embedding vectors for 'car' and 'truck' have a high degree of similarity, no matter which method you used to compute the vectors themselves.

These methods encode these semantic similarities because they have observed many words in the company of other words many, many times. That's why they need large corpora, so they can get lots and lots of usage examples for as many words as possible. However, if the words you care about are rare, new or highly specialised (like medical/legal/scientific terms), it might be difficult to find lots of examples of those words in publicly available text. A proposed alternative is to train embeddings on knowledge resources that explicitly describe word semantics, instead of or in addition to large text corpora. These knowledge resources can be lexical taxonomies like WordNet, or specialised ontologies and terminologies like SNOMED-CT (from the biomedical domain) or EuroVoc (a large multidisciplinary and multilingual thesaurus that covers the activities of the EU). I call concept embeddings those embeddings that have been trained on knowledge resources.

The projects below seek to compare word embeddings with concept embeddings in different settings. I'm also open to other project ideas that fit in this space, or to tweak one of the projects below to your particular interests.

Identification of specialised-language named-entities and/or concepts: word vs concept embeddings

Neural methods are the current state of the art in Named-Entity Recognition (NER). However, evaluations have mostly been done on the usual general-domain entities of Person, Organisation and Location. This project aims to identify concepts and/or entities from a specialised domain (legal, biomedical, information technology, art, etc.) More concretely, this project seeks to find out if a neural NER using concept vectors performs better on this task than the same neural NER using generic word vectors.

The project will use distant supervision techniques in order to obtain large amounts of training data but it may involve the manual annotation of a small corpus for evaluation purposes. (We'll try to leverage as much publicly available data as possible).

Gender and/or ethnic bias in word and concept embeddings

Research has shown that word embeddings learn gender biases present in the text from which they are trained. For example, vectors for occuaptions such as 'housekeeper', 'nurse' and 'secretary' tend to be closer to the vector for 'women' whilst vectors for 'carpenter', 'mechanic' and 'engineer' tend to be closer to the vector for 'men'.

We human beings are usually unaware of our own biases. But we exhibit them in our behaviours, like our linguistic expressions (oral and written), which is why they end up in text in the first place. Knowledge resources like WordNet or Freebase/Wikidata are carefully constructed to explicitly describe concepts and their relationships. As far as I am aware, there has not been efforts to determine if gender bias (or any other type of social bias) is present in these knowledge resources. One way to find out if concept embeddings are biased would be to explicitly test them, just like traditional word embeddings were.

Depending on time available, we can also investigate whether bias as an artifact encoded in word embeddings can be used to detect bias in text from say social media, news outlets or essays or opinion pieces.

Some references in this topic:

  • Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186. http://doi.org/10.1126/science.aal4230
  • Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes. Proceedings of the National Academy of Sciences of the United States of America, 115(16), 3635–3644. http://www.pnas.org/cgi/doi/10.1073/pnas.1720347115
  • Hovy, D. (2015). Demographic Factors Improve Classification Performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (pp. 752-762). Beijing.

Dr. Michael Manzke

https://www.cs.tcd.ie/Michael.Manzke/

Room

Extension

02-012 Stack B

2400


Some general info

I am a member of the Graphics, Vision and Visualisation research group interested in the areas of:

  • michael
  • peter
  • tbd
  • tbd
  • tbd

Suggested Projects:

  1. Topics
    • tbd
    • tbd
    • tbd
    • tbd
    • tbd
  2. tbd
    • tbd

Dr. Kris McGlinn

3D Building Information Modeling with Googles Tango

Googles project tango is a tool for smart phones and tablets for creating 3D models of buildings (or apartments) by simply carrying the device around! In this project you will explore the use of tango to develop 3D models, and examine methods for annotating and linking these models to existing building information models standard like IFC (Industry Foundation Classes). You will examine how objects are identified and labeled using the tango sdk, you will see if you can tag those objects and export them as IFC entities. You will see whether walls and other objects can be given additional properties, like materials and thickness. This integrated data can then be used for different applications, ranging from navigation too energy simulations.

Dr Richard Millwood

Room Lloyd Institute 0.29
Extension 1548

Keywords: education - programming - HCI - emulation - learning design

My background is in education and technology, I am course director for the MSc Technology and Learning. You can read more about me at richardmillwood.net and you can also have a look at my recently completed PhD by Practice.

Here are some areas of interest which may inspire a project, some based on my research plan:

1. Learning computer programming

Two ideas here:

  1. Developing in Blockly to meet some of the challenges made in my blog on Jigsaw Programming
  2. Constructing an online research instrument for tapping in to teachers' tacit knowledge about teaching computational thinking. This may be an extension using Python to an existing content management system such as Plone to add functionality for innovative interactive forms of survey.

2. Collaborative support in learning computer programming

This is directed at 'educational github' to suit young learners. The development will create an interface that better clarifies and support the roles and workflow in collaborative work online so that this can be more readily learnt in use. It is not clear exactly what software development would be appropriate and would suit someone with imagination and drive to be very creative.

3. UK National Archive of Educational Computing apps

The design and development of device-responsive educational apps (for mobile, tablet and web) based on historical educational programs, such as Snooker:

  1. Original Snooker - from 1978 in BASIC
  2. Prototype Snooker Angles - from 2013 as iPhone Web app

Key features are that the app includes a faithful emulation of the original educational program as a historical account, and that the modern app maintains similar educational objectives but may be updated to take advantage of new technology and new pedagogy. The app must be able to scale appropriately and work on phone, tablet and web page. This is an HTML5 development project using Scaleable Vector Graphics for interactive visuals.

I have a list of apps that I have prioritised to support the the UK National Archive of Educational Computing.

Dr Martin Emms

Room

Extension

O'Reilly LG.18

1542

I would be interested in supervised FYPs which centre around applying computational techniques to language.

Machine Learning and Word Meanings

An interesting question is the extent to which a machine can learn things about word meanings just from lots of textual examples, and there has been a lot of research into this (see Wikipedia intro), all based on the idea that different meanings show up in rather different contexts eg.

move the mouse till the cursor ...

dissect the mouse and extract its DNA ...

Several kinds of project could be attempted in this area, starting either from scratch or building on code that I could supply.

  1. One kind of system would do what people call unsupervised word sense disambiguation, learning to partition occurrences of an ambiguous word into subsets all of which exhibit the same meaning.

    Someone last year added the diachronic twist of attempting to recognise, from time-stamped text, that semantic matters concerning that word have undergone a change over a period of time: mouse (1980s) and smashed it (last 10 years?) have acquired novel meanings.

  2. Another possibility is to investigate to what extent it is possible to recognise that a particular word combination has a non-compositional meaning, that is, a meaning not entirely expectable given its parts; for example, shoot the breeze means chat.

There are a number of corpora that can be used to drive such systems, such as the Google n-grams corpus, spanning several hundred years (viewable online here, also available off-line)
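As a starting point for the unsupervised disambiguation idea above, here is a minimal sketch that clusters occurrences of 'mouse' by the words around them, using scikit-learn; the toy contexts are invented for illustration.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # each string is the context surrounding one occurrence of "mouse"
    contexts = [
        "move the mouse till the cursor reaches the icon",
        "click the mouse button to select the file",
        "dissect the mouse and extract its DNA",
        "the lab mouse was injected with the compound",
    ]

    # represent each context as a bag-of-words vector, then cluster
    vectors = TfidfVectorizer(stop_words="english").fit_transform(contexts)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)
    print(labels)  # occurrences in the same cluster plausibly share a sense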

Others

I have interests in (and sometimes knowledge of!) a number of areas in which a project would be possible.

Projects Exploiting Treebanks
We have copies of several so-called treebanks, which are large collections of syntactically analysed English. Each treebank contains a large number of items of the following kind:

[Figure: an example treebank tree for a syntactically analysed sentence]

One issue that such corpora allow us to explore empirically is whether or not multiple centre embedding occurs in English (right, left and centre embedding are illustrated below):

[S I think [S he said [S I'm deaf]]] right embedded
[NP_gen [NP_gen john's] father's] dog left embedded
[NP the candidate that [NP the elector that I bribed] chose] centre embedded

Tree distance
Ways to calculate the difference between two trees, and things you might do with that (a paper)

Lambek Calculus Categorial grammar
for anyone keen on Logic, a logic-inspired way to do grammar

Continuations of projects from 15-16

Virtual Erasmus
Erasmus students spend a year of their study in a distant land. The idea is to simulate as much as possible of that process in a program that outgoing students could interact with, probably telescoping the year out into a few weeks. One part would be to emulate the bureaucratic obstacle course that such a year entails: you want to open a bank account, is it open Tuesday afternoons, have you got your residence permit, you want your residence permit, have you got your insurance, have you got another 3 passport photos...

There is the possibility of continuing a development of this from last year

Scrabble
Someone wrote an interactive Scrabble game, with a computer opponent. This could be reworked and developed further.

Dr. Eamonn O Nuallain

Room WR 13.3.1

Extension 3691

Radio Tomography, RF Coverage Prediction and Channel Modelling in Wireless Networks

My postgraduate students and I research Radio Tomography (seeing through things with Radio Frequency (RF)), RF propagation and wireless channel modelling. The objectives of this research are to enable the rapid and accurate prediction of RF coverage, frequency domain response, data throughput and interference mitigation in wireless networks, in particular as they relate to Cognitive Radio and MANETs. We are also interested in intelligent handoff algorithms. Such information is of great interest to wireless network planners and regulators. We are concerned with both fixed and ad-hoc networks. The methodology we employ is largely computational mathematics, computational electromagnetics and RF propagation. We research, develop and code numerical techniques to achieve our objectives, and we test our results against field measurements. We are also very interested in exploiting the capabilities of parallel computing and software radio. Most of our programming work is done in C, C++ and Matlab, and using the NS3 simulator. These projects are available either as Final Year Projects or Masters dissertations. You should be interested in mathematics, ad-hoc networks and RF. If you are interested in pursuing a project in this area then contact me by e-mail and we can meet to discuss.

Prof. Declan O'Sullivan

I am generally interested in projects that have to cope with diverse data, both structured and unstructured.

Typically the projects I supervise use AI-based, knowledge-driven techniques such as linked data, semantic web and semantic mapping, and technologies such as RDF, SPARQL, OWL and SPIN.

If you have a project you think might be of interest to me that uses such techniques, email me with a description and your idea.

There are also opportunities to work on aspects of research ongoing within the Science Foundation Ireland ADAPT centre focusing on Digital Content Technology, and ongoing digital projects with the TCD Library and Ordnance Survey Ireland.


In any case, some specific projects on offer this year include:

Topic 1: Open Data Quality Assessment

Assistant Supervision: Dr. Jeremy Debattista

Organisations are frequently publishing their datasets in an open fashion so that they can be reused by third parties for various tasks. For example, government agencies release their data as a transparency measure, with data journalists taking this data to scrutinise the government's policies. Whilst having this open data is an important step in this data-centric world, its quality is of equal importance. This project looks at extending a state-of-the-art quality assessment framework, Luzzu [1], to include the assessment of open data formats such as CSV, XML, XLS and JSON, amongst others. Furthermore, an open data quality monitoring service will have to be developed in order to constantly monitor and assess the quality of datasets in open data portals such as datahub.io [2,3]. For the latter, a survey of the various quality metrics for open data will be conducted. A minimal sketch of one such metric appears after the references.

[1] Jeremy Debattista, Sören Auer, Christoph Lange: Luzzu - A Methodology and Framework for Linked Data Quality Assessment. J. Data and Information Quality 8(1): 4:1-4:32 (2016)

[2] https://datahub.io

[3] https://old.datahub.io/
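To make the idea of a tabular quality metric concrete, here is a minimal sketch of a completeness check for CSV data using pandas; the file name and the metric's definition (share of non-empty cells) are illustrative assumptions, not part of Luzzu.

    import pandas as pd

    def csv_completeness(path):
        """Fraction of cells that are populated: a simple completeness metric."""
        df = pd.read_csv(path)
        total = df.shape[0] * df.shape[1]
        return df.notna().sum().sum() / total if total else 0.0

    print(csv_completeness("open_dataset.csv"))  # hypothetical file, e.g. 0.97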

Topic 2: Decentralised Social Network for Academics

Assistant Supervision: Dr. Jeremy Debattista

The aim of this project is to create an ecosystem for academics who are interested in contributing to Open Science [1]. Currently, portals such as ResearchGate act as a "social network" for academics, where one can create a profile and upload PDFs of their works. Whilst this can indirectly enable open access [2], the material is still stored in a centralised system, leaving the uploaded artefacts subject to the platform's discretion. Efforts have been made to encourage academics to write papers using tools such as Dokieli [3] and host the research themselves; however, there are still a lot of barriers when it comes to setting up, collaborating, and sharing. To potentially overcome some of these challenges, in this project we will build a decentralised social network for academics based on Sir Tim Berners-Lee's Solid [4] architecture. This will give users all the power to decide what to do with their data. We will look at how to create profiles using well-known schemas and implement dashboards (e.g. for microblogging [5] or sharing of presentations) where users can interact with their acquaintances (refer to the W3C Recommendation Linked Data Notifications [6]). Furthermore, each user will have the opportunity to write their own research paper using a friendly interface built on the Dokieli (or similar) markup. Additional functionality could include mechanisms to enable concurrent collaboration in paper writing. Finally, once published online, these papers can be transparently reviewed by other academics connected to the network.

[1] https://en.wikipedia.org/wiki/Open_science

[2] https://en.wikipedia.org/wiki/Open_access

[3] https://dokie.li/

[4] https://solid.mit.edu/

[5] https://en.wikipedia.org/wiki/Microblogging

[6] https://en.wikipedia.org/wiki/Linked_Data_Notifications / https://www.w3.org/TR/ldn/

Topic 3: Converting 3D non-geospatial building geometries into 2D geospatial data.

Assistant Supervision: Dr. Kris Mc Glinn

Currently, building geometry (i.e. the geometry of a building's walls, windows, columns, etc.) is usually stored using tessellated geometries. This project will investigate the conversion of these geometries into 2D geospatial data. This will allow geospatial functions to be run over existing building geometries, and also potentially reduce the number of point conversions required when transforming geometries (as each point will be relative to one 'universal' coordinate system). The project will examine the conversion of Industry Foundation Classes geometries into GeoSPARQL and well-known text (WKT); a minimal sketch of the flattening step appears after the resources below.

Resources - https://github.com/kmcglinn/IfcOwl2IfcOwlGeoloc

https://github.com/jyrkioraskari/IFCtoLBD/tree/geo

http://ifc2bot.adaptcentre.ie/IFC2BOTWeb/
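As a toy illustration of the flattening step, the shapely Python library can project tessellated 3D faces onto the ground plane and emit WKT; the two triangles below are invented coordinates standing in for a wall's tessellation.

    from shapely.geometry import Polygon
    from shapely.ops import unary_union

    # two 3D triangles approximating one face of a wall (x, y, z)
    triangles = [
        [(0, 0, 0), (4, 0, 0), (4, 0.3, 0)],
        [(0, 0, 0), (4, 0.3, 0), (0, 0.3, 0)],
    ]

    # drop the z coordinate to project onto the 2D ground plane, then merge
    footprint = unary_union([Polygon([(x, y) for x, y, z in t]) for t in triangles])
    print(footprint.wkt)  # e.g. POLYGON ((0 0, 4 0, 4 0.3, 0 0.3, 0 0))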

**ALREADY TAKEN** Topic 4: Supporting integration of building data based upon geospatial data.

Assistant Supervision: Dr. Kris Mc Glinn

This project will examine whether it is possible to use the existing geospatial functions supported by GeoSPARQL to query building data with a geospatial component (geolocation). The purpose is to determine if two locations are similar enough to be able to state that they represent the same building. This will require an investigation of the viability of the geospatial functions to support this, or alternatively, whether a bespoke method is required. For both methods, scalability must be determined, i.e. how to manage large datasets of geolocations, and also what distance threshold can safely be used before two building locations can no longer be said to be the same building. This project will also explore methods for visualizing the integrated data and displaying it in an intuitive way. A sketch of the kind of query involved appears after the resources below.

Resources - http://geovis.adaptcentre.ie/
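The kind of query involved might look like the following sketch, which uses the SPARQLWrapper Python library to ask a GeoSPARQL-enabled endpoint for the distance between two building geometries; the endpoint URL, building URIs and graph structure are assumptions for illustration.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # hypothetical endpoint; geof:distance is a standard GeoSPARQL function
    sparql = SPARQLWrapper("http://example.org/buildings/sparql")
    sparql.setQuery("""
        PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
        PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
        PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>
        SELECT ?dist WHERE {
            <urn:building:A> geo:hasGeometry/geo:asWKT ?wktA .
            <urn:building:B> geo:hasGeometry/geo:asWKT ?wktB .
            BIND(geof:distance(?wktA, ?wktB, uom:metre) AS ?dist)
        }
    """)
    sparql.setReturnFormat(JSON)
    for row in sparql.query().convert()["results"]["bindings"]:
        print("distance (m):", row["dist"]["value"])  # compare to a threshold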

Topic 5: Integrating 3D indoor models into building information standards.

Assistant Supervision: Dr. Kris Mc Glinn

This project will investigate the use of the structure sensor (structure.io) to automatically or semi-automatically generate and annotate 3D models to support integration into existing building information standards. This type of integration will make more detailed 3D Building Information Models (BIM) available to support use cases related to robotics, energy, etc.

Topic 6: Ethical Data Integration

This project will continue from a dissertation that was undertaken in 2017/18 on an approach to allow people undertaking integration between datasets to check whether there might be ethical issues involved in undertaking the integration.

Topic 7: Building a temporal knowledge graph of events from social media

Assistant Supervision: Dr. Fabrizio Orlandi

Online news and social media provide a vast, semi-structured and dynamic source of reference knowledge about current events and temporal information. Knowledge graphs have been shown to be a key solution for structuring large amounts of information on the Web and facilitating semantic analysis and AI algorithms. However, existing knowledge graphs (e.g. DBpedia, Wikidata, Google Knowledge Graph, etc.) focus mostly on static entity-centric information and are insufficient in their coverage of events and temporal relations.

This project aims at addressing this gap by integrating dynamic and event-centric information into a structured knowledge graph. The resulting KG will include events and their descriptions with temporal relations extracted from Social Media APIs or other semi-structured sources.

The project will start by surveying existing approaches for representing event-related temporal data as Linked Data (e.g. [1]) and defining the appropriate representation model. Then, a dynamic approach that continuously extracts semi-structured data from Web APIs and transforms it into the chosen representation model will be developed. The results will be evaluated against existing static state-of-the-art approaches [2]; a minimal sketch of the representation step appears after the references.

[1] https://arxiv.org/abs/1804.04526

[2] http://adimen.si.ehu.eus/~rigau/publications/jws2016.pdf
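A minimal sketch of the representation step, using the rdflib Python library to assert one event with a start time; the event URI, label and the use of schema.org terms are illustrative choices, not the project's required model.

    from rdflib import Graph, Literal, Namespace, RDF, URIRef
    from rdflib.namespace import XSD

    SCHEMA = Namespace("http://schema.org/")
    g = Graph()

    # one event extracted from a (hypothetical) social media API response
    event = URIRef("http://example.org/event/dublin-marathon-2018")
    g.add((event, RDF.type, SCHEMA.Event))
    g.add((event, SCHEMA.name, Literal("Dublin Marathon 2018")))
    g.add((event, SCHEMA.startDate,
           Literal("2018-10-28T09:00:00", datatype=XSD.dateTime)))

    print(g.serialize(format="turtle"))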

Professor Carol O'Sullivan

Updated: 17/09/2018

My list of projects will be updated by Friday Sept 21st.

Prof Owen Conlan

Location: O'Reilly Institute, Room F.29 Phone: +353-1-8962158

50 Years of Computer Science at TCD

Available. This year marks the 50th anniversary of Computer Science at Trinity College Dublin. This project will develop a knowledge-based, interactive and personalised commemorative app that will allow users to interact with the rich history of the School of Computer Science and Statistics and to explore the exciting future directions Computer Science research at TCD is promoting.

In the first instance, contact Prof Owen Conlan

On-Mobile Privacy-sensitive Personalisation

Available. Current personalisation techniques, e.g. the tailoring of content to an individual user's preferences, rely heavily on server-side solutions that require potentially sensitive information about the user to be stored remotely. With the advent of more powerful mobile devices, the potential to achieve high degrees of personalisation on the mobile device, using existing approaches, is significant. This project will explore the design and development of a personalisation framework that is deployed on-device and does not share user model information with third parties.

In the first instance, contact Prof Owen Conlan

Augmented Video Search

Available. Searching for content within videos is difficult. Current techniques rely heavily on author-created metadata to discover the video as a whole, but there are few solutions for searching within the video. This project will explore how off-the-shelf multimodal techniques (i.e. image analysis, audio feature detection, speech-to-text) may be used to support search within a video.

In the first instance, contact Prof Owen Conlan

Visual Search

Available. Modern internet-based search prizes precision over recall, striving to present users with a select few relevant resources. Users can quickly examine these resources to determine if they meet their needs. There are other situations, such as patent search or performing research on medieval corpora, where recall, i.e. retrieving all relevant documents, is essential. This project will examine visual techniques to support users in determining and refining recall in a search environment. The project builds on over 10 years of Personalisation, Entity-based Search and Visualisation work surrounding the 1641 Depositions and more recent work on the 1916 Pension Statements.

In the first instance, contact Prof Owen Conlan

Supporting the construction of Visual Narratives

Available. Research in narrative visualisations, or visual narratives, has been growing in popularity in the Information Visualisation domain and in online journalism. However, there is limited support offered to authors in constructing visual narratives, especially non-technical authors.

This project will aim to advance the state of the art in visual narrative construction by supporting authors in building visual narratives, namely the visualisations in the narrative, including automatic sequencing between the visualisations.

In the first instance, contact Dr Bilal Yousuf

Ethics-by-Design

Available. The ethical implications of modern digital applications are growing as they encroach on more and more aspects of our daily lives. However, the techniques available for analysing such ethical implications struggle to keep up with the pace of innovation in digital businesses, and tend to require the mediation of a trained ethicist. The Ethics Canvas is a simple tool to enable application development teams to brainstorm the ethical implications of their designs, without the oversight of a trained analyst. It is inspired by Alex Osterwalder's Business Model Canvas, which is now very widely used in digital business formation. The Ethics Canvas exists both as a paper-based layout and as a responsive web application (see https://www.ethicscanvas.org/). Currently the online version can only be used by individuals, and cannot be used in the collaborative mode that is a key benefit of the paper version.

This project will extend the Ethics Canvas implementation to support remote collaborative editing of the canvas. Users should be able to form teams and then review, make changes, comment on and discuss, accept/reject changes, and track/resolve issues.

Further, the digital application development community could benefit from sharing previous ethical analyses using the online Ethics Canvas. The benefit of such sharing would be magnified if it led to a convergence in the concepts used in different canvas analyses. Therefore the project will allow teams to publish their canvas into a public repository and to annotate its content with tags from a shared structured folksonomy, i.e. a community-formed ontology capturing concepts such as different types of users, user groups, personal data, data analyses, sensor data, and risks. Within an individual canvas, tags can be used to link entries in different boxes to provide more structure to the canvas. The aggregation of tags from different completed canvases forms a folksonomy that can be made available as an open live linked-data dataset, searchable by Ethics Canvas users.

In the first instance, contact Prof Owen Conlan



Dr. Rachel McDonnell

Updated: 20/09/2018

I am an Assistant Professor in Creative Technologies and I am available to supervise FYP and MSc projects in the area of Computer Graphics. I am interested in all aspects of computer graphics, but particularly in the animation, rendering and perception of realistic virtual humans.

I have a range of projects on offer (see below). These projects involve developing graphics applications in game engines (such as Unreal Engine 4). I am also open to novel project ideas from students on virtual humans or Virtual Reality. Note: some of these projects could be adapted for MSc or FYP level.

[AVAILABLE FYP Level] Avatar Facial Mimicry

When you are sad, it makes me sad! This project involves creating an application that can recognize a user's facial emotions (from a phone or webcam) and apply them to a talking avatar. The student should investigate an appropriate open-source facial emotion detection package, integrate it into a game engine, and alter the emotion of a talking avatar in real-time to mimic the emotion of the user.
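For a feel of the detection half, here is a minimal sketch using the open-source fer package (one of several options, not prescribed by the project) to read a webcam frame with OpenCV and report the dominant emotion; feeding the result into a game engine is left to the project.

    import cv2
    from fer import FER  # open-source facial emotion recognition package

    camera = cv2.VideoCapture(0)          # default webcam
    ok, frame = camera.read()
    camera.release()

    if ok:
        detector = FER()
        emotion, score = detector.top_emotion(frame)  # e.g. ("happy", 0.92)
        print(emotion, score)             # drive the avatar's expression from this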

[AVAILABLE FYP Level] Character facial animation using iPhone X

In this project, you will create an application in UE4 using the iPhone X depth camera to drive a face in real-time. See this video example.

[AVAILABLE MSC Level] Motion Capture analysis for gesture animations

Develop a machine learning system to automatically segment speech-gesture sequences. The system should receive motion capture data of a speaker as input and determine the start and end frames of individual speech gesture motions. A rough rule-based baseline is sketched below.
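Before any learning, a crude velocity-threshold baseline hints at what a learned segmenter must beat; the numpy sketch below invents a random wrist trajectory purely for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    positions = rng.normal(size=(300, 3)).cumsum(axis=0)  # stand-in wrist trajectory

    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1)  # per-frame speed
    moving = (speed > np.median(speed)).astype(int)             # crude motion flag

    # transitions 0->1 start a candidate gesture segment, 1->0 end one
    starts = np.flatnonzero(np.diff(moving) == 1) + 1
    ends = np.flatnonzero(np.diff(moving) == -1) + 1
    if moving[0]:                        # handle a segment starting at frame 0
        starts = np.insert(starts, 0, 0)
    segments = list(zip(starts, ends))   # (start, end) frame pairs
    print(segments[:5])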

[AVAILABLE MSC Level] Speech animation from voice

In this project, you will use a deep learning approach to generate appropriate speech animation for a virtual character from voice input. See this video for a comparison of the state of the art.

[AVAILABLE MSC Level] Estimating 3D Shape from a single photo

In this project, you will automatically create a facial geometry mesh to match an input photograph, using a database of face models. See this paper, Section 3 and this paper for an idea on how to do this. Also, see how to create a morphable face model here.

[AVAILABLE FYP Level] Character portrait lighting in Virtual Reality

This project will explore the effects of character and environment lighting and how it can be adapted to increase the appeal of a character in a VR open-world environment. The student will incorporate lighting theory from cinematography and psychology research into a CG environment, in order to enhance the character's appeal and emotional expressivity in real-time.

[AVAILABLE MSc Level] Expression Transfer for Virtual Humans

In this project, you will work on speeding up the process of creating digital doubles (virtual replicas of real people) by applying deformation transfer to scanned expressions from one character to another. See this paper for example.

[AVAILABLE FYP Level] Realistic Virtual Human Rendering in Real-time

Virtual humans are becoming increasingly realistic, even in real-time. In this project, you will focus on photorealistic faces, using the most up-to-date techniques for skin, eye and hair rendering in Unreal Engine 4 (e.g., using high-resolution scanned data from the Digital Emily project and the Meet Mike character from UE4).

Please e-mail me at ramcdonn [at] scss.tcd.ie if you are interested in any of my projects, or if you have your own graphics project proposal that you would like to discuss with me. Strong technical skills will be necessary for these projects.

Dr. Rob Brennan

Senior Research Fellow, ADAPT Centre, School of Computer Science and Statistics.
Email:rob.brennan@scss.tcd.ie
My projects are in the areas of data quality, data governance and data value with an emphasis on graph-based linked data or semantic web systems. Please note that I am unlikely to supervise your own project ideas due to current commitments. These projects are only for MSc students.

Projects

TAKEN 1. Extracting Data Governance information and actions from Slack chat channels

Main contact: Dr Alfredo Maldonado (address corrected on 26/9/2017)
Data governance means controlling, optimising and recording the flow of data in an organisation. In the past, data governance systems have focused on formal, centralised authority and control, but new forms of enterprise communication like Slack need to be leveraged to make data governance more streamlined and easier to interact with. However, systems like Slack produce vast amounts of unstructured data that are hard to search or process, especially months or years later. Thus we need a way to extract the most relevant conversations in Slack and turn them into structured data or requests for specific data governance actions, like a change in a data sharing policy. This project looks at ways to extract relevant conversations and turn them into data governance actions via an interactive Slack bot that uses machine learning and natural language processing to identify relevant conversations, and then interjects in Slack conversations to prompt users to interact with a data governance system (a minimal sketch of the interjection step follows the keywords below).
This project is conducted in collaboration with Collibra Inc., a world-leading provider of data governance systems.
Keywords: Natural Language Processing, Machine Learning, Python, Data Governance
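A minimal sketch of the interjection step using the slack_sdk Python package; the token, channel name, and the trivial keyword trigger are stand-ins for the real NLP classifier the project would build.

    from slack_sdk import WebClient

    client = WebClient(token="xoxb-...")  # bot token, elided

    def maybe_interject(message_text, channel):
        """Stand-in for the ML/NLP classifier: a naive keyword trigger."""
        if "data sharing" in message_text.lower():
            client.chat_postMessage(
                channel=channel,
                text="This looks like a data governance decision - "
                     "record it in the governance system?",
            )

    maybe_interject("can we relax the data sharing policy for team X?", "#data-gov")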

NO LONGER AVAILABLE 2. Automated Collection and Classification of Data Value Web Content

Main contact: Dr Rob Brennan
Jointly supervised with: Prof. Seamus Lawless
This research aims to automate the collection and classification of discussions of data value (e.g. "How much is your data worth?", "Data is the new Oil!") on sites like Gartner or CIO.com. This will complement our traditional survey of academic papers discussing data value management. The project will attempt to identify from the web content: the most important dimensions of data value (e.g. data quality), metrics for measuring them, the different models of data value proposed by authors, and applications of data value models. The research will explore new ways to classify and conceptualise the domain of data value. Ranking dimensions for importance is also an interesting potential challenge. The project may also consider how best to structure the conceptualisation of the domain for different roles or types of consumers.
Keywords: Information Retrieval, Natural Language Processing, Knowledge and Data Engineering

TAKEN 3. Adding W3C Linked Data Support to Open Source Database Profiling Application

Main contact: Dr Rob Brennan
Jointly supervised with: Dr. Judie Attard
The Data Warehousing Institute has estimated that data quality problems currently cost US businesses more than $600 billion per year. Everywhere we see the rise in importance of data and the analytics based upon it. This project will extend open source tools with support for new types of web data (the W3C's Linked Data) and for sharing or integrating tool execution reports over the web.
Data profiling is an important step in data preparation, integration and quality management. It is basically a first look at a dataset or database to gather statistics on the distributions and shapes of data values. This project will add support for the W3C's Linked Data technology to an open source data profiling tool. In addition to providing traditional reports and visualisations, we want the tool to be able to export the data profile statistics it collects using the W3C's data quality vocabulary and data catalog vocabulary. These vocabularies allow a tool to write a profile report as Linked Data and hence share the results with other data governance tools in a toolchain. This will be an opportunity to extend the use of these vocabularies beyond pure linked data use cases to include enterprise data sources such as relational databases. A minimal sketch of such an export follows the keywords below.
Keywords: Knowledge and Data Engineering, Java programming, Linked Data, Data Quality
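A minimal sketch of what the export might look like, using rdflib and the W3C Data Quality Vocabulary (DQV); the dataset URI and the example null-ratio metric are invented for illustration.

    from rdflib import Graph, Literal, Namespace, RDF, URIRef
    from rdflib.namespace import XSD

    DQV = Namespace("http://www.w3.org/ns/dqv#")
    g = Graph()
    g.bind("dqv", DQV)

    # one profiling statistic expressed as a DQV quality measurement
    m = URIRef("http://example.org/profile/measurement/1")
    g.add((m, RDF.type, DQV.QualityMeasurement))
    g.add((m, DQV.computedOn, URIRef("http://example.org/dataset/customers")))
    g.add((m, DQV.isMeasurementOf, URIRef("http://example.org/metrics/nullRatio")))
    g.add((m, DQV.value, Literal("0.03", datatype=XSD.double)))

    print(g.serialize(format="turtle"))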

TAKEN 4. Ethical Web Data Integration

Main contact: Dr Rob Brennan
Jointly supervised with: Prof. Declan O'Sullivan
In an era of Big Data and ever more pervasive dataset collection and combination, how do we know the risks, and whether we are doing the right thing? This project will investigate the characteristics and requirements for an ethical data integration process. It will examine how ADAPT's semantic models of the GDPR consent process can be leveraged to inform ethical decision-making and design as part of the data integration process. This work will extend the ADAPT M-Gov mapping framework.
Keywords: Ethics, Knowledge and Data Engineering, Java programming

TAKEN 5. Automatic Identification of the Domain of a Linked Open Data Dataset (New 25/9/2017)

Main contact: Dr Rob Brennan
Jointly supervised with: Dr Jeremy Debattista
As the Web of Data grows, there are more and more datasets becoming available on the web [1]. One important challenge in selecting and managing these datasets is to identify the domain (topic area, scope) of a dataset. Typically a dataset aggregator (such as datahub.io) will mandate that minimal dataset metadata is registered along with the dataset, but this is often insufficient for dataset selection or classification (such as the dataset types used by the LOD cloud).
The aim of this dissertation topic is to create a process and tools to automatically identify the topical domain of a dataset (using metadata, querying the dataset vocabularies, and clustering using ML algorithms). Thus it will go beyond traditional Semantic Web/Linked Data techniques by using a combination of ontology reasoning or queries and machine-learning approaches. Given an input dataset from datahub.io, LOD Laundromat or the weekly dynamic linked data crawl (http://swse.deri.org/dyldo/data/), the datasets should be categorised into a specific topical domain so that consumers can filter this large network according to their needs. A minimal sketch of the clustering idea follows the reading list below.
Keywords: Knowledge and Data Engineering, Machine Learning
[1] http://lod-cloud.net
Further Reading
[2] http://dws.informatik.uni-mannheim.de/fileadmin/lehrstuehle/ki/pub/SchmachtenbergBizerPaulheim-AdoptionOfLinkedDataBestPractices.pdf
[3] http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/
[4] http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
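A minimal sketch of the clustering idea: treat the vocabulary terms each dataset uses as a document, vectorise, and cluster with scikit-learn. The term lists below are invented stand-ins for what a query over each dataset's vocabularies would return.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # one "document" per dataset: the class/property terms it uses
    dataset_terms = [
        "foaf Person knows name homepage",       # social data
        "geo lat long SpatialThing location",    # geographic data
        "bibo Article Journal doi citation",     # bibliographic data
        "foaf Agent member Group name",          # social data again
    ]

    X = TfidfVectorizer().fit_transform(dataset_terms)
    labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)
    print(labels)  # datasets sharing a cluster plausibly share a topical domain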

NO LONGER AVAILABLE 6. Automated Selection of Comparable Web Datasets for Quality Assurance (New 25/9/2017)

Main contact: Dr Rob Brennan
Jointly supervised with: Dr Jeremy Debattista
Many open Linked Data datasets suffer from poor quality, and this limits their uptake and utility. There are now a number of linked data quality frameworks, e.g. Luzzu [1], designed to address the need for data quality assessment and the publication of quality metadata. However, in order to apply some quality measures, e.g. "Completeness Quality" [2], it is necessary to have a comparable dataset to test against. For example, the comparable dataset could form a Gold Standard or benchmark which can be used to compare with other similar data.
This project will investigate the methods required to (1) identify the requirements for a comparable dataset based on a specific set of quality checks and a dataset to be tested, and (2) then use these requirements to find the best possible dataset to act as a Gold Standard from a pool of open datasets such as datahub.io. Example requirements may include matching the domain, ontology language, presence of specific axiom types, ontology size, ontology structure, data instances present and so on.
Keywords: Knowledge and Data Engineering, Data Quality
[1] http://eis-bonn.github.io/Luzzu/
[2] http://www.semantic-web-journal.net/system/files/swj773.pdf

TAKEN 7. Data Quality Dashboard (New 29/9/2017)

Main contact: Dr Rob Brennan
Jointly supervised with: Dr Jeremy Debattista
The Luzzu data quality assessment framework is a flexible, open source Java-based toolset for assessing the quality of Linked Data that is now being maintained by the ADAPT Centre at TCD. Luzzu supports semantic reporting of quality assessments using the dataset quality vocabulary [2], the quality problem ontology and the Luzzu metric implementation ontology. However, it is still a command-line tool and the semantic reports it generates are optimised for machine readability. In this project we will build a data quality dashboard that visualises the semantic outputs of Luzzu and makes it easy for quality managers or data stewards to infer the implications of a data quality assessment task.
Keywords: Knowledge and Data Engineering, Data Quality, User Interface Design
[1] http://eis-bonn.github.io/Luzzu/
[2] http://theme-e.adaptcentre.ie/daq/daq.html

Dr. Marco Ruffini

Final year projects:

End-to-end capacity reservation in Software Defined Networks

Software Defined Networks have revolutionised computer networks by introducing a means to enhance network programmability through the use of standardised and open access interfaces. The aim of this project is to implement an end-to-end capacity reservation mechanism across aggregation and core networks based on the use of stacked Multi-Protocol Label Switching (MPLS) labels. User requests are forwarded to a centralised controller that takes available capacity into account to allocate the requested capacity over an end-to-end link. A background in network programming and the Python programming language is strongly advised. A minimal sketch of the controller's admission logic appears below.
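A minimal sketch, in plain Python, of the admission-control bookkeeping such a centralised controller would need; the topology, link capacities and path are invented, and the actual programming of MPLS label stacks onto switches is out of scope here.

    # remaining capacity (Mbit/s) per directed link, keyed by (node, node)
    capacity = {("A", "B"): 1000, ("B", "C"): 400, ("C", "D"): 1000}

    def reserve(path, mbps):
        """Admit a reservation only if every link on the path has headroom."""
        links = list(zip(path, path[1:]))
        if any(capacity[l] < mbps for l in links):
            return False                    # reject: some link is too full
        for l in links:
            capacity[l] -= mbps             # commit the reservation
        return True

    print(reserve(["A", "B", "C", "D"], 300))  # True: fits on every link
    print(reserve(["A", "B", "C", "D"], 300))  # False: B->C now has only 100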

Prof. Siobhan Clarke

Room Lloyd 1.17

Extension 2224

I am interested in software systems that make cities smarter! Below are some examples that I am co-supervising. If you have any ideas of your own for smart cities - e.g., smart transport, smart energy management, smart water management, do please contact me, as I am happy to supervise projects in this area.

Co-Supervised with Dr. Mauro Dragone

HeatMap App (Participatory version)

The goal of this project is to build an Android application that can be used to assess the number of users present at the entrances of museums, shopping malls and art exhibitions, in buses and around bus stops, in car parks, or in any other public, shared places where people occasionally congregate and/or queue. To this end, the student will build a solution using one of the available frameworks for peer-to-peer communication between multiple handsets [1].

HeatMap App (Vision version):

The goal of this project is to build a system that is able to estimate the length of the queue of visitors waiting to enter a museum, art exhibition or other place of public interest, such as the Old Library and the Book of Kells Exhibition in Trinity College. The student will use a Galileo single-board computer and a pan-and-tilt camera, and will develop a computer vision algorithm using the OpenCV library [2] to segment, track and count people in the queue (a minimal detection sketch appears below). There is also scope to develop adaptive solutions to account for different visibility conditions, and to build an Android application.
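As a starting point, OpenCV ships a pedestrian detector based on HOG features; the sketch below counts detections in a single image (file name invented), which a full solution would extend with tracking across frames.

    import cv2

    # OpenCV's built-in HOG-based pedestrian detector
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    frame = cv2.imread("queue.jpg")                # one camera frame
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8))
    print(f"approx. {len(rects)} people in view")  # crude queue-length estimate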

Bus Tracker:

The goal of this project is to build an Android application to infer and gather useful knowledge about the travel habits of users carrying smart mobile phones. Specifically, the target application should be able to recognize which public transport route (e.g. train, bus, LUAS), and between which stops, the user is currently traveling. The student will use current publish/subscribe middleware for location-aware applications, such as [3][4][5], and investigate the adoption of machine learning techniques, such as neural networks, to classify routes based on the analysis of streams of noisy sensor data.

Extension of Funf:

The Funf Open Sensing Framework [6] is an extensible sensing and data processing framework for Android mobile devices. The core concept is to provide an open source, reusable set of functionalities, enabling the collection, uploading, and configuration of a wide range of data signals accessible via mobile phones. The goal of this project is to extend Funf with support for peer-to-peer communication between multiple handsets, in order to enable the coordination of the efforts of multiple users involved in participatory sensing campaigns.

Urban GeoLocation:

The goal of this project is to assess and improve the ability to locate users carrying smart mobile phones while driving, cycling, or simply walking along urban pathways. In particular, the student will tackle the problems suffered by GPS-based location in urban environments, where the signals from the positioning satellites are often blocked or bounced off buildings and other structures. Contrary to existing approaches, which try to explicitly account for these phenomena, the student will assess the benefits of using multiple sensor data and the feedback gathered from multiple users over time, to build solutions that are able to exploit the power of the crowd to acquire complex models and improve their accuracy over time. The work will require the student to familiarise themselves with Particle Filters [7], as this is the overall framework that is likely to be used to integrate the various components of this project (a toy sketch follows).
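To show the particle filter machinery in its smallest form, here is a toy 1-D sketch with numpy: particles predict a walker's position, are weighted against a noisy GPS-like fix, and are resampled. All numbers are invented.

    import numpy as np

    rng = np.random.default_rng(1)
    particles = rng.uniform(0, 100, size=1000)   # candidate positions (metres)

    def step(particles, gps_fix, motion=1.0, gps_noise=8.0):
        # predict: everyone moves ~1 m with some process noise
        particles = particles + motion + rng.normal(0, 0.5, particles.size)
        # weight: likelihood of each particle given the noisy GPS fix
        w = np.exp(-0.5 * ((particles - gps_fix) / gps_noise) ** 2)
        w /= w.sum()
        # resample: keep particles in proportion to their weights
        return rng.choice(particles, size=particles.size, p=w)

    for fix in [50.0, 51.2, 52.1]:               # stream of noisy fixes
        particles = step(particles, fix)
    print(particles.mean())                      # position estimate, roughly 52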

SensorDrone:

The goal of this project is to develop an Android application using the Sensordrone kit [8]. Sensordrone is a modular device the size of a key-chain, equipped with temperature, luminosity, 3-axis accelerometer and air-quality sensors. The device can be paired with the user's mobile phone over Bluetooth Low Energy. A number of useful applications may be built by exploiting the combination of sensors available on the Sensordrone and the sensors and geolocation functions available on the user's smart phone. Of particular interest are applications targeting:

  • Road quality information - is the road deteriorating in specific locations? E.g. early identification of pothole formation.
  • Bike scheme monitoring - real-time info on where and when the cycle fleet is being used and what the cycles are encountering.
  • Mapping urban pollution data - noxious gases, noise, temperature.
  • Cyclist routing - using information on pollution, journey times for bikes, and stats on areas where cyclists swerve or brake suddenly.
  • Localised weather alerts for cyclists (and potentially data collection on the device).

Smart Home Projects:

Project ideas are also welcome for projects addressing the development of smart home services and their integration within city-wide participatory sensing frameworks [9]. The student will be required to develop software prototypes for the OpenHAB open source software platform for home automation [10]. A range of hardware is available for these projects, including a  single board computer and home automation sensors and actuators, such as occupancy sensors, energy monitors and wireless switches.

Links to relevant technologies and further readings:

[1] Peer-to-peer frameworks for Android: http://code.google.com/p/p2p-communication-framework-for-android/, https://code.google.com/p/peerdroid/, http://developer.android.com/guide/topics/connectivity/wifip2p.html, https://github.com/monk-dot/SPAN
[2] OpenCV: http://www.opencv.org/
[3] MQTT: http://mqtt.org/
[4] OwnTracks: http://owntracks.org/
[5] Google Play services for Android developers: https://developer.android.com/google/play-services/location.html
[6] Funf: http://www.funf.org/about.html
[7] Particle Filter: www.igi.tugraz.at/pfeiffer/documents/particlefilters.pdf
[8] SensorDrone: http://www.sensordrone.com/
[9] CityWatch: http://www.citywatch.ie
[10] OpenHAB: http://www.openhab.org

Co-Supervised with Dr. Ivana Dusparic

Smart energy grid: Intelligent Residential Demand Response

The European Union's 2050 roadmap is resulting in the increasing penetration of renewable energy sources and electric vehicles (EVs) in Europe. In Ireland, it is expected that 80% of electricity will come from renewable sources by 2050, and 60% of new cars sold in 2050 will be electric. As a consequence, the electrical energy grid is facing significant changes in the supply of resources as well as changes in the type, scale, and patterns of residential user demand.

In order to optimize residential energy usage, demand response (DR) techniques are being investigated to shift device usage to the periods of low demand and to the periods of high renewable energy availability. DR refers to modification of end-user energy consumption with respect to their originally predicted consumption patterns.

This project will investigate the use of intelligent learning-based techniques in the implementation of large-scale DR aggregation techniques suitable for residential customers. Some of the aspects to be addressed within the scope of the project include: household energy use learning and prediction (as enabled by, e.g., smart meters or smart heating devices like Nest and Climote), evaluation of centralized vs decentralized DR approaches, responsiveness of techniques to different usage patterns and different renewable energy generation patterns, and the types of devices most suitable for DR programmes (e.g., heating, EVs).

Smart energy grid: Home energy usage prediction and optimization based on sensor data

This project will investigate how home energy usage can be learnt, predicted and optimized. Patterns of energy use can be learnt and predicted based on occupants' historical behaviours (e.g., learning that the user generally leaves for work at 8:15am, plays football after work on Wednesdays, goes out straight after work on Fridays, etc.), combined with various sensors and data sources to provide more accurate amended predictions (e.g., mobile phone calendar, GPS location of the user, level of battery charge in the electric vehicle, outside temperature, etc.). Learning and intelligent agent techniques will be investigated and applied to learning the observed patterns and establishing demands and constraints on user device usage (e.g., the charging duration an electric vehicle will require based on the length of the daily trip, the time heating needs to be turned on to achieve optimal temperature by user arrival time, the estimated time by which hot water is required for the shower, etc.). Multi-objective optimization techniques will then be applied to schedule required device usage so as to satisfy the devices' use requirements and constraints as well as desired policies set by users (e.g., minimize energy price, maximize use of renewable energy, etc.). A toy scheduling sketch appears below.
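As a toy version of the scheduling step, the sketch below greedily places an EV's required charging hours into the cheapest hours of a day-ahead price curve; the prices and the 4-hour requirement are invented, and a real system would optimise several objectives at once.

    # day-ahead electricity prices (cent/kWh) for hours 0..23, invented numbers
    prices = [8, 7, 6, 6, 7, 9, 12, 15, 14, 13, 12, 11,
              11, 12, 13, 15, 18, 20, 19, 16, 13, 11, 9, 8]

    def schedule_charging(prices, hours_needed):
        """Pick the cheapest hours in which to charge (single-objective greedy)."""
        ranked = sorted(range(len(prices)), key=lambda h: prices[h])
        return sorted(ranked[:hours_needed])

    print(schedule_charging(prices, 4))  # [1, 2, 3, 4]: overnight charging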


Prof. Seamus Lawless

I supervise projects in the areas of Information Retrieval, Personalisation and Digital Humanities. The common focus of these projects is the application of technology to support enhanced, personalised access to knowledge. If you would like to talk about a project or suggest one in these areas, email me at seamus.lawless@scss.tcd.ie

Project Details: The following is a list of projects available for the 2018-2019 academic year:

Information Retrieval and Web Search

Word Embeddings for Improved Search in Digital Humanities
If I gave you a document that was written entirely in Japanese Kanji and asked you to tell me which symbols had similar meanings, how would you do it (assuming you cannot read Kanji)? Even for a human, finding a realistic answer to this question is extremely difficult. Yet this question reflects a fundamental problem with how computers perceive texts. Unless a human annotator provides some form of descriptive mark-up, a computer simply does not understand the meaning behind the text it curates. Word embeddings are a recent development in the text analysis community. By applying a family of algorithms collectively known as Word2Vec, a computer is able to examine a large collection of documents and derive relationships between words based solely on their contextual usage (e.g. the word "King" has some strong association with the word "Queen". Also, the vectors produced are additive and subtractive: by subtracting "Man" from "King" and adding "Woman" to the result, we obtain a vector which is extremely close to the vector for "Queen"). This Masters project aims to investigate the use of word embeddings in supporting better search and exploration of a collection of 17th century historical documents. This may involve generating suggestions for alternative query formulations in a search interface. In more advanced terms, we may seek to build a retrieval model based on the word vectors generated by Word2Vec. A minimal training sketch appears below.
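For orientation, here is a minimal sketch of training word vectors with the gensim Python library; the toy corpus is far too small to yield the King/Queen analogy, which needs a large collection such as the historical documents above.

    from gensim.models import Word2Vec

    # toy corpus: one tokenised sentence per list (a real corpus has millions)
    sentences = [
        ["the", "king", "ruled", "the", "kingdom"],
        ["the", "queen", "ruled", "the", "kingdom"],
        ["move", "the", "mouse", "to", "the", "icon"],
    ]

    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

    # with enough text, vector arithmetic recovers analogies:
    # model.wv.most_similar(positive=["king", "woman"], negative=["man"])
    print(model.wv.most_similar("king", topn=3))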

Text Segmentation using Explicit Semantic Analysis for Information Retrieval
Text segmentation is the process of placing boundaries within text to create segments according to some task-dependent criterion. It aims to divide text into coherent segments which reflect the sub-topic structure of the text. Explicit Semantic Analysis (ESA) is a method that represents meaning in a high-dimensional space of concepts, automatically derived from human-built knowledge repositories such as Wikipedia. State-of-the-art approaches to text segmentation use the lexical representation of text (i.e. terms in text) to segment it. Thus, in this project, the semantic representation of text (i.e. using concepts instead of terms) via ESA will be used instead. The output of the segmentation task (which could be linear or hierarchical) will be indexed based on concepts produced from the semantic representation of text. Egozi et al. (2011) proposed a concept-based indexing and retrieval approach based on ESA, where they indexed documents as text segments and each segment was indexed based on its conceptual representation in the concept space they built from Wikipedia. However, in their approach they relied on segmenting each document with a pre-defined window; in other words, all segments produced from all documents had the same size. Thus, the aim of this project is to apply text segmentation (using ESA) to split each document into coherent segments where each segment is represented by concepts (instead of terms). After that, each segment will be indexed based on this conceptual representation and the retrieval task will be based on these segments. The main idea is: for a text query, convert it into a vector of concepts using ESA (i.e. for each term in the query, map it to its relevant concepts in the concept space). Using this vector, retrieve segments that best match the concepts in that vector. Combine the scores of segments that belong to the same document and rank documents based on these accumulated scores.

References:
Egozi, O., Markovitch, S. and Gabrilovich, E. (2011) 'Concept-Based Information Retrieval Using Explicit Semantic Analysis', ACM Transactions on Information Systems. New York, NY, USA: ACM, 29(2), pp. 8:1-8:34. doi: 10.1145/1961209.1961211.

Data Analysis and Data Science

Twitter as an alternative Review Site
People tend to be spontaneous in what they post to Twitter. They express relatively unfiltered opinions by giving immediate voice to their daily experiences. Tweets can often take the form of a "review", giving an insight into the tweeter's opinion about what they are interacting with in the real world, be it a restaurant, hotel, TV show, celebrity etc., which is different from other online reviews (ratings/comments). For instance, if someone tweets an image of a great view of the Eiffel Tower from a hotel balcony, it indirectly indicates that this hotel could be of interest for someone looking for accommodation close to the Eiffel Tower with good views. But Twitter data is notoriously "noisy". People often do not indicate the full/proper name of an entity (e.g. a hotel name) in their tweet. Extracting/recognizing an entity from a tweet and analysing the sentiment towards that entity is a challenging problem. This project focuses primarily on the detection of review-like tweets from a specific geographic area, such as Dublin. If time permits, it will also focus on analysing the sentiment expressed towards entities present in those tweets (a minimal sentiment sketch appears below).
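For the sentiment half, here is a minimal sketch using NLTK's VADER analyser, which is designed for short social media text; the example tweet is invented.

    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon")  # one-off lexicon download

    sia = SentimentIntensityAnalyzer()
    tweet = "Amazing view of the Eiffel Tower from our hotel balcony!"
    print(sia.polarity_scores(tweet))
    # e.g. {'neg': 0.0, 'neu': 0.5, 'pos': 0.5, 'compound': 0.7}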

#ITK #DoneDeal #FakeNews
The summer Transfer Window is a busy time for football clubs, supporters and the media. This is reflected in the volume of activity on social media platforms such as Twitter. Clubs are continually linked with signing and releasing players; rumours circulate that clubs are interested in particular players and making moves to recruit them. A very small fraction of these rumours actually come to pass. There are lots of Twitter accounts which claim to be #ITK ("In The Know"); is this actually the case? I am interested in collecting a large Twitter dataset related to the Summer Transfer Window, particularly focused on the English Premier League. I would like to apply machine learning techniques to that dataset to look for patterns in the tweets and to perform an analysis of the performance of certain accounts in predicting transfers.

Moneyball 2.0
Fantasy sports games are played by an ever-increasing number of sports fanatics, Fantasy Football being a prime example. However, finding the best players to pick on your team week-in week-out is, for most, something of a lottery. Players are chosen based on many different sources of information, including: what journalists write about how they have performed; what fans write on social media; and by analysing statistics (e.g. number of goals scored, assists, etc.). In this project we propose to look at both the players' performance and the social aspects. For the former, we analyse various statistics to make an informed judgement on how a player actually performed in a match, potentially identifying some key indicators, similar to the approach described for baseball in the book "Moneyball". For the latter, we can apply sentiment analysis to social media streams during match days (most Twitter posts can be identified using a game hashtag, e.g. #SOUEVE) and to news reports to enhance our data analysis. The idea is that with such information, an algorithm could predict who should be transferred out, and who should be transferred in, for the following game week in order to give the maximum return in points.

There are a number of projects available in the area of Credibility, Bias and Trust. These include:
  • Credibility is only Skin Deep!
  • Initial Impressions of the Credibility of Different Categories of News Websites
  • The Impact of Content Focus on Perceptions of Credibility
  • The Impact of 'Information Scent' on the Perception of Credibility
  • Bias in Sports Broadcasting During International Rugby Games
  • A Comparison of the Favourableness of Photographs of Political Figures Versus Headlines and News Articles
  • Length is Strength: How the Proportion of Space a News Webpage Dedicates to the News Article Versus Other Elements (such as Advertising, Tweets, and Promoted Content) Influences Perceived Credibility
  • Bias in the Imagery used by Political Parties

Brief detail on each of these projects can be found below, for a more detailed description please email me.

Credibility is only Skin Deep!
A study to investigate whether the number of pictures on a news website's homepage, and the proportion of skin they show, correlate with a reduction in the perception of the credibility of that site.

Initial Impressions of the Credibility of Different Categories of News Websites
A study to measure users' initial impressions of the credibility of news articles from different categories of news websites (quality press, tabloid, online only etc.) in order to determine how each of these visually distinct styles of presentation affects perceived credibility.

The Impact of Content Focus on Perceptions of Credibility
A study to determine the impact of the category of news website (quality press, tabloid, online only etc.), determined by its visible content focus, on the perception of credibility.

The Impact of 'Information Scent' on the Perception of Credibility
It has been claimed that the technical features of a webpage, such as related or supporting content, sources, and indicators of popularity and attention act as "Information Scent", cues by which a user may form judgements of credibility. This project is to design and execute a crowdsourced experiment to investigate the impact of features of a webpage which emit information scent on judgements of credibility.

Bias in Sports Broadcasting During International Rugby Games
The project involves the design and development of an experiment to measure whether the sports reporting in any nation is biased in favour of their team, by how much, and how consistent it is, specifically in the context of international rugby games. The research should also analyse reporting from neutral nations.

A Comparison of the Favourableness of Photographs of Political Figures Versus Headlines and News Articles
This project involves the design and development of an appropriate crowdsourced experiment to measure the favourableness of imagery of several political figures over a defined period of time, the conducting of textual analysis of the reporting of events, and the prominence of the reporting. The project will also involve the suitable visualization of the results produced.

Length is Strength: How the Proportion of Space a News Webpage Dedicates to the News Article Versus Other Elements (such as Advertising, Tweets, and Promoted Content) Influences Perceived Credibility
This project involves the design of an experiment to test whether the amount, structure, and proportion of news article text on a news webpage influences the perceived credibility of the news article. This project will also examine sentence level elements such as the use of data and figures in the content.

Bias in the Imagery used by Political Parties
This project will involve a study into the images used in election material to investigate which political parties used the least flattering images of their opposition party members, for political purposes.

Website/Application Design and Development

Development of a Prototype Research Ethics Application System
Research Ethics approval is a necessary step to conduct any experimentation involving people, as part of research in the School of Computer Science and Statistics in Trinity College. Its importance to high quality research cannot be overestimated. However, the current application process is cumbersome, unwieldy, and often produces sub-standard applications. The aim of this project is to create a system which streamlines the creation of high quality, standardised research ethics applications.

Redevelopment and Analysis of a Repository and Classification of Credibility Measures
In recent years, work has been ongoing on the development of a repository and classification of measures used in empirical research to measure the concept of credibility in online experiments. This repository and classification can be seen in this table. This project will involve the development of a new dynamically generated website and table to display the existing data in a more easily consumable form. It will require the design and development of a database query interface to enable users to engage in faceted search and display of content. The data should also be analysed for hidden trends, producing visualisations of the content.

2018/2019 FYP/MSc project topics for Stephen Farrell

If interested, send mail.

Detailed project scope and goals can be adjusted to fit student skills and the level of effort available.

  1. Develop (parts of) an Internet scanning infrastructure tailored to Ireland

    Scanning public-facing Internet services in order to detect security- and privacy-relevant patterns and problems is becoming well-trodden ground. Typical studies attempt Internet-scale IPv4 scans, e.g., to detect uses of outdated ciphers in uses of the Transport Layer Security (TLS) protocol. More local scans (e.g., https://eprint.iacr.org/2018/299) could however produce results that are easier to translate into mitigations, where Internet service operators (e.g. web sites, mail server administrators) take action to improve their security posture. Building infrastructure to do such scans (e.g. using zmap/zgrab) in a repeatable and extensible manner, and providing ways to extract information that may be of use to asset-holders, has numerous challenges. The overall project aims to build and operate such an infrastructure tailored for Ireland, but so that it could be replicated for other similarly-sized scans. Depending on expertise and resources available, an FYP or MSc dissertation could tackle various sub-problems in this space.

  2. Analysis of SSH and other remote login attempts recorded over 2018

    I manage a bunch of (~8) VMs with Internet-facing services. Those are constantly hit by SSH login attempts that I record. Some are occasionally hit (one in particular) with attempts to authenticate to email services. The project is to analyse the available logs (from Fail2Ban, denyhosts and the UFW firewall), compare those to other available public data and (resources/expertise permitting) to propose, implement and trial ways to improve (or replace) the open-source IDS tools in use.

  3. Implement and test a proof-of-concept for MPLS opportunistic security

    Multi-Protocol Label Switching (MPLS) is a sort-of layer 2.5 that carries a lot of Internet traffic in backbone networks. There is currently no standard for how to encrypt traffic at the MPLS "layer." I am a co-author on an Internet-draft that specifies a way to opportunistically encrypt in MPLS. The task here is to implement and test that, which will likely result in changes to the specification (perhaps adding the student's name to the eventual RFC). There are existing simulation/emulation tools that support MPLS and IPsec that should make implementation fairly straightforward, for example Open vSwitch. A proof-of-concept demonstration with performance figures vs. cleartext and IPsec is the goal. Comparison against MACsec would also be good but may be too hard to test in this environment. This is a continuation of a 2017/2018 project that succeeded in setting up a basic simulation with Open vSwitch and mininet and implemented the bulk encryption, so this year's project will be more about the key exchange.

  4. Deploy local DPRIVE and DoH recursive resolvers and test performance

    DPRIVE is a specification for how to run the DNS protocol over TLS in order to attempt to mitigate the privacy problems with the use of the Domain Name System. DoH is similar but tunnels DNS queries and responses over HTTPS, and is supported by current releases of Firefox. There are implementations of DPRIVE and DoH available now that are ready for experimental deployments. The goal of this project is to deploy a local recursive resolver using these protocols to test effectiveness and efficiency via artificial test queries and responses but also, where/if possible, handling real DNS queries from clients that have opted in to being part of the experiment.

  5. Solar-powered LoRa gateway power management

    In January 2017 we deployed a solar-powered LoRa gateway in TCD. Power management for that is relatively simple: the device aims to be "up" from 11am to 4pm each day, but handles low-power situations by sleeping until its batteries have been sufficiently charged. The goal here is to analyse the power consumption and traffic patterns recorded since January 2017 in order to improve system up-time via enhanced power management. (For example, the device could decide not to sleep from 4pm if the battery level is above a new threshold.) The current power management daemon is a simple C program that monitors battery voltage and sets the device to sleep or wake according to the simple policy described above. Modifications to that can be designed, validated based on existing data, and then implemented, deployed and tested with the existing hardware. An rPi-based implementation was the subject of a 2017/2018 project and should provide a useful starting point.

  6. Prototype TLS1.3 SNI Encryption

    The Server Name Indication (SNI) extension to TLS is extremely widely used to support multiple web sites on the same host (e.g. VirtualHosts in Apache2) but represents a major privacy leak, as SNI has to be sent in clear, given that no scalable way of hiding SNI has been deployed. The TLS working group have just adopted a draft that describes a way in which SNI can be protected should one web site wish to "front" for another. (Think of a CDN like Akamai being "cover" for some human-rights organisation that would be censored in the relevant jurisdiction.) The goal here is to prototype and test this SNI encryption scheme in order to assist in standardising the mechanism.

  7. Play with leaked password hashes

    How fast can you check for the presence of a password (hash) in a list of 320 million leaked hashes? The naive shell script I wrote in a few minutes takes 30 seconds on my laptop. The goal here is speed, without requiring any networking (so no sending password hashes over any network), on a reasonably "normal" machine. The list takes 12GB to store in a file, one hash per line. The list may also be updated occasionally, though in bulk, not as a trickle. Some side-channel resistance (e.g. considering timing, OS observation) is also a goal here. As you'd expect, the leaked list was mostly reversed to plaintext within a few weeks.
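
    For scale: one obvious approach is to keep the file sorted and binary-search it with seeks, which needs only about 28 reads for 320 million records. A minimal sketch, assuming one uppercase 40-hex-character SHA-1 per 41-byte line, pre-sorted:

        import os

        REC = 41  # 40 hex chars + newline

        def contains(path, digest):
            target = digest.upper().encode()
            with open(path, 'rb') as f:
                lo, hi = 0, os.path.getsize(path) // REC
                while lo < hi:
                    mid = (lo + hi) // 2
                    f.seek(mid * REC)
                    rec = f.read(40)
                    if rec == target:
                        return True  # NB: early exit leaks timing information
                    if rec < target:
                        lo = mid + 1
                    else:
                        hi = mid
            return False

    The early exit and the data-dependent seek pattern are exactly the kind of side channels the project would need to consider.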

  8. Develop an application using Cryptech open-source HSM

    Cryptech is an open-source hardware project developing a cryptographic hardware security module (HSM). The project here is to develop an application using the alpha release of the Cryptech hardware. The application could be a DNSSEC signer or anything else that uses digital signatures with modest performance requirements.
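
    Since the Cryptech software stack includes a PKCS#11 provider, a signing application might start from something like the following sketch using the python-pkcs11 package; the library path, PIN and key parameters are all placeholders:

        import pkcs11

        lib = pkcs11.lib('/usr/lib/libcryptech-pkcs11.so')  # hypothetical path
        token = next(lib.get_tokens())
        with token.open(rw=True, user_pin='123456') as session:
            pub, priv = session.generate_keypair(pkcs11.KeyType.RSA, 2048)
            signature = priv.sign(b'zone data to be signed')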

  9. Assigned: MIFARE hacking

    Investigation of possible attack vectors on a simulated ticketing system using blank MIFARE DESfire EV1 cards.


Prof Aljosa Smolic


Thesis and Final Year Project proposals 2018-2019


Please visit our web page:

V-SENSE Student Projects


https://v-sense.scss.tcd.ie/?page_id=1102

Dr. Stefan Weber

Room: Lloyd 1.41

My areas of interest include:

  • Distributed Systems
    • Scale/Containerized Environments
  • Self-managing networks
    • Mobile Ad hoc networks
    • Active Agents
  • Software-Defined Networking
  • Internet Alternatives
    • Information-Centric/Named-Data Networking
    • Recursive InterNetwork Architecture (RINA)
  • Network Security
    • Botnets
    • Honeypots
  • Internet-of-Things (IoT)
In general, I am happy to talk about any project ideas in these areas. Below are a number of projects that I have proposed or that students have worked on in the past:

Message Adaptation in the Internet of Things

Transport protocols such as the Transmission Control Protocol (TCP) were developed to support the transfer of files from a source to a destination, and the architecture of the Internet has developed around this model. The Internet of Things will present a significant challenge for this architecture if large numbers of sensors transmit individual small messages. This project will investigate the effect of small messages on routers in the current Internet architecture and develop a solution that attempts to prevent the flooding of networks with small messages.
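
A rough sketch of one possible mitigation - an aggregation point that batches many small sensor messages into a single larger payload before they cross the wider network (the message format and thresholds are placeholders):

    import json
    import time

    class Aggregator:
        """Batch small sensor messages; emit one payload per batch."""
        def __init__(self, max_batch=50, max_delay=1.0):
            self.buf = []
            self.deadline = None
            self.max_batch, self.max_delay = max_batch, max_delay

        def offer(self, msg):
            if not self.buf:
                self.deadline = time.monotonic() + self.max_delay
            self.buf.append(msg)
            if len(self.buf) >= self.max_batch or time.monotonic() >= self.deadline:
                return self.flush()
            return None  # still batching

        def flush(self):
            batch, self.buf = json.dumps(self.buf).encode(), []
            return batch  # one payload carrying many readings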

Game Design using Information-Centric Networking

Information-centric networking enables the caching of content within infrastructure components in a network in order to reduce network traffic and latency. This project will investigate the design of a communication protocol for games based on an information-centric approach such as NDN/CCN.

Design Characteristics of Honeypots - with an Aim to Study Botnets

This project will investigate the design of honeypots and the available information on botnets, and develop a design for a honeypot that will allow the gathering and evaluation of information about the distribution and design of botnets, such as their size, use, etc. As a means to study these botnets, a honeypot or set of honeypots should attract the attention of botnets, collect samples of the communication of these networks and provide an analysis of the collected traffic.

Adaptive Communication in Mobile Networks

Communication between sets of mobile devices in networks such as 3G/4G networks generally relies on well-known hosts that coordinate the communication between the devices. Similar to approaches to multicast communication, this project will investigate the development of a protocol that allows communication to adapt from an initial centralized model via a rendezvous node to a distributed model in which the devices communicate directly with one another.

Prof. Simon Wilson

Contact: swilson@tcd.ie or phone +1062. I am based in room 133, Lloyd Institute.


Large scale factor analysis with application to the Cosmic Microwave Background.

This project is best suited to an MSc Data Science student. The Cosmic Microwave Background (CMB) is a background radiation observed across the entire sky that is believed to have arisen at a very early age of the universe. A good understanding of its properties tells us a lot about how the universe must have evolved. A key problem in imaging the CMB is that it is contaminated by lots of other sources that must be separated from it. A team at TCD has developed a methodology, based on factor analysis, that does this at the scale of current images of the sky; e.g. the satellite Planck, designed to image the CMB, has images of about 12 million pixels. There are several objectives to this project. First, the current code is written in C and we wish to develop a Python wrapper to this code that will allow others to use the software more easily. Then we wish to apply the method to data from Planck and compare the results with other source separation methods. Finally, if time permits, we want to look at a harder source separation problem in CMB called the polarisation separation problem and develop the method to allow it to be applied to this problem. This project's focus is on the use of Python and distributed computing to scale a factor analysis algorithm, as well as gaining a deeper understanding of the application domain in astronomy.
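
As a rough illustration of the wrapping step, ctypes (or cffi) can expose the existing C code to Python once it is built as a shared library; the library name and entry point below are hypothetical stand-ins for the real API:

    import ctypes
    import numpy as np

    lib = ctypes.CDLL('./libcmbsep.so')  # hypothetical shared-library name
    # assumed C signature: int separate(const double *pixels, int n, double *out);
    lib.separate.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int,
                             ctypes.POINTER(ctypes.c_double)]
    lib.separate.restype = ctypes.c_int

    def separate(pixels):
        """NumPy-friendly wrapper around the (hypothetical) C entry point."""
        pixels = np.ascontiguousarray(pixels, dtype=np.float64)
        out = np.empty_like(pixels)
        rc = lib.separate(pixels.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
                          pixels.size,
                          out.ctypes.data_as(ctypes.POINTER(ctypes.c_double)))
        if rc != 0:
            raise RuntimeError('separate() failed with code %d' % rc)
        return out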

Brendan Tangney

Room: 316, Lloyd Building
Extension: 1223


  1. Bray's Heuristics for Math Education. PROJECT ASSIGNED
  2. A Collaborative Tool to Scaffold the Ph.D. Process. CAWriter is a web-based computer-supported collaborative working toolkit to support research students in the academic writing process (Byrne J.R. and Tangney B. 2012). This project will take an existing prototype, extend its capabilities and engage in a user study of the tool's efficacy.
  3. A Collaborative Tool to Scaffold Skills Acquisition in Project-Based Learning. A specification has been developed (Ellis N., 2017) for a tool to assist instructors and students in a portfolio-based approach to skills acquisition, in a manner similar to the collection of Scout badges. This project will take that specification and design, implement and test the tool in a real learning setting.
  4. Argumentation Visualisation. Many arguments, particularly in Plato's dialogues, have a clear structure or flow. For example, backtracking occurs frequently when one partner in the dialogue presents a proposition only to have to retract it later. This tool would, given some text, assist an instructor in creating a visualisation of the argument in that text.

Dr Tim Fernando

Room: ORI LG.17
Extension: 3800

I offer projects on knowledge representation and/or natural language semantics. Apart from programming, some mathematical maturity would be useful to survey the research literature on formal methods in artificial intelligence. Below is a list of specific topics, but I am happy to discuss other interests you may have that are broadly related.

Timelines and the semiotic triangle

    Timelines (such as this) order events chronologically, while the semiotic triangle relates the world, language and mind. The aim of this project is to design timelines that distinguish between an event E in the world, a linguistic event S that describes E, and a reference point R representing a perspective from which E and S are viewed (following Reichenbachian accounts of tense and aspect).

Finite State Semantics

    How can we apply finite automata and transducers to represent meaning? More specifically, (i) how can we structure semantic information through the successor relation on which strings are based, and (ii) how far can finite state methods process that information? A project within this topic can take a number of directions, including intensionality, temporality, comics, and formal verification tools.
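
    To give a flavour (a toy sketch, not any particular published encoding): an event string can be represented as a sequence of "boxes" (sets of fluents), with temporal properties checked by simple automaton-like passes over the string:

        def holds_until(string, p, q):
            """Accept strings where p holds in every box before the first box
            containing q (and reject if q never occurs)."""
            for box in string:
                if q in box:
                    return True
                if p not in box:
                    return False
            return False

        # "it rained until the sun came out"
        s = [{'rain'}, {'rain'}, {'rain', 'sun'}]
        print(holds_until(s, 'rain', 'sun'))  # True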

Frames and grounded cognition

    This project examines the role of frames in a general theory of concepts investigated by, for example, the DFG Collaborative Research Centre 991: The Structure of Representations in Language, Cognition, and Science. The focus is to test the idea of grounded cognition (e.g., Barsalou) against various formulations of frames. Related to this are notions of force (from Talmy to Copley) and image schema in cognitive semantics.

Constraint satisfaction problems and institutions

    The goal of this project is to formulate Constraint Satisfaction Problems (CSPs) in terms of institutions in the sense of Goguen and Burstall, analyzing relations between CSPs category-theoretically as institution (co)morphisms.

Textual entailments from a temporal perspective

    Computational approaches to recognizing textual entailments need to be refined when dealing with time. Temporal statements typically refer to a temporal period that cannot simply be quantified away by a Priorean past or future operator. The challenge facing this project is how to incorporate that period into approaches such as Natural Logic that have thus far ignored it.

Bounded granularity and incremental change: scales

    This project is for someone particularly interested in linguistic semantics. The project's aim is to examine whether or not granularity can be bounded in accounts of incremental change in natural language semantics involving scales (e.g. Beavers) as well as mereology and mereotopology. Attention will be paid to how to model refinements (and coarsenings) of granularity.

Monadic Second Order Logic for natural language temporality

    This project is for someone particularly interested in logic. A fundamental theorem in logic due to Büchi, Elgot and Trakhtenbrot identifies regular languages with the models definable in Monadic Second-Order Logic (for one binary relation). The aim of this project is to explore applications of this theorem to natural language temporality --- including tense and aspect.

Dr. Vasileios Koutavas

Areas of interest: programming language implementation and theory, concurrent and distributed systems, formal methods, software verification.

A number of projects are available in these areas which involve the development of part of a compiler, or the use of tools such as Why3 and Coq for verifying software systems. Theoretical projects are also available. These projects are particularly suitable for students who have taken at least one module on compilers, software verification or programming languages.

Dr Carl Vogel

Room: ORI.LG16
Extension: 1538

All projects include review of the relevant literature, and where appropriate, argumentation in support of analyses given.

Note that implementation is not an essential component of every project in computational linguistics -- there's definitely more to the field than computer applications -- however, formal rigor is quite essential.

Don't worry if you don't recognize the names of the systems/languages mentioned. If the theme itself interests you, we can sort out the technical details in person. Of course, these are all just suggestions; we're assuming that the final project description will be individually tailored in most cases.

Students who do projects with me will agree to regular weekly meetings at which we discuss the preceding week's work, and plans for the following week's. The initial weeks typically involve a considerable amount of diverse readings. Students intending to work with me on their project are encouraged to contact students who have done projects with me in the past. (See here for some details on that.)

Projects listed here are suitable for final year students on the CSLL/CSL course; students from other undergraduate and postgraduate courses may also find suitable topics here.

  1. Develop an HPSG (Head-driven Phrase Structure Grammar) grammar for a fragment of Irish and implement it in the LKB, focusing on the syntax of one of the following construction types:
    • Noun Phrases
    • Embedding Verbs
      1. proposition embedding verbs
      2. question embedding verbs

    Some examples of comparable projects are available for Irish, French, and German.
  2. Design and implement a chart parser for a context-free grammar (CFG) with a dominance interpretation for phrase structure rules. This is essentially a framework for underspecified semantics. A disambiguation method must also be provided.
  3. Extend the semantic coverage in one of the frameworks included in the CLEARS (Computational Linguistics Education and Research Tool for Semantics) system.
    Particular areas of interest might be: negation, spatial modifiers, belief reports. An example of a project that did this in the past is available here.
  4. Extend the functionality of a generic interface for web-based experimentation in cognitive science (this will involve empirical research in an area of cognitive science to be agreed upon).
    This offers several possible topics and with varying degrees of implementational requirements. For all, some implementational extensions to the underlying system are necessary. Some will involve more or less actual experimentation using the system. Previous stages of the system are described, among other places, here, here, and here.
  5. Improve on the design and implementation of a web-based multiplayer scrabble game, with the feature that point assignments to letters are calculated dynamically, on the basis of frequencies derived from corpora. A description of the base system is provided here. An extension of that work is described here. There are many delightful ways in which this work can be extended. One example is including a facility for team play. Another is implementing an automated player that humans can choose to play against. (A rough sketch of the dynamic letter-scoring idea appears after this list.)
  6. Extend and experiment with a platform for experimenting with models of dynamic systems, with particular attention to modeling the evolution of linguistic behaviors. A starting point is described here; subsequent work is described here.
  7. Extend work on utilities for statistical analysis of linguistic corpora and apply them to specific tasks such as detection of grammatical errors, and automated correction suggestion.
  8. Develop and validate lexical resources for sentiment analysis.
  9. Develop methods within computational stylistics for investigating text-internal linguistic variables with external variables using large online textual resources. A comparable project is described here.
  10. Develop methods for tracking events under varying descriptions in journalistic prose.
  11. Develop a Prolog implementation simulating the operation of theories in dynamic semantics.
  12. Develop a Prolog implementation of real-time belief revision systems.
  13. Extend an automatic crossword generator implemented in Java and Prolog. Documentation of its state in 2003 is available here. A more recent version is documented here. One avenue in which to extend this is to establish it as a system fully anchored on the Suns, with application in language learning and other topical areas.
  14. Develop online tools for other forms of fun with words -- an innovative anagram server, a crossword clue generator, etc.
  15. Formal syntactic and semantic analysis of dialogue. Example past attempts at this are available here and here.
  16. Extend a computational model of Latin morphology, with lexical lookup to achieve Latin to English Machine Translation.
  17. Extend a prototype grammar checker for Irish implemented in Prolog, integrating it with a spelling checker for Irish.
  18. Implement an efficient spelling checker for Irish in Java, in the context of a webserver that collects words and their frequencies of use in checked documents, along with some other utilities for corpus linguistics.
  19. Incorporate an Irish language spelling checker and general proofing tools facilities into StarOffice/OpenOffice.
  20. Parse a large Irish-English dictionary (the Ó Dónaill). A description of a comparable project is provided here and here.
  21. Projects in psycholinguistics. Past examples appear here, here, here and here.
    Some specific topics I would like to explore further:
    1. Linguistic priming and unconscious coordination in written communication.
    2. Degrees of grammaticality and acceptability.
    3. Human reasoning with mildly inconsistent information.
    4. Computational stylistics (corpus driven syntactic and semantic analysis).
  22. Some general purpose utilities that can replicate standard offerings such as "DoodlePolls" and shared calendars, but with local data stores that accommodate varying levels of privacy and data protection.
  23. Develop tools to harvest from online sources a multi-lingual database of named entities.
  24. Build computational tools in support of structuralist analysis of myth and mythic-metaphorical representation (in the style of Lévi-Strauss).
  25. Test empirical dimensions of theories of holism in formulaic language associated with (im)politeness expressions.
  26. Test empirical predictions of recent theories of (im)politeness with respect to third-party and projected self-perception.
  27. Test empirical consequences of theories of gender differences in language use (for example, see here).
  28. Analyze proxy measures of mutual understanding in dialogue (see here, or here, or here, etc.).
  29. Examine parameters that influence perception and choice in the ultimatum game (for example, see here).
  30. Topics in collaboration with Dr. Maria Koutsombogera: Analysis and modelling of multimodal and multiparty interactions. The projects will exploit a newly created corpus of multimodal interactions between three participants. The objective of the projects is to address some of the challenges in developing intelligent collaborative systems and agents that are able to hold a natural conversation with human users. A starting point in dealing with these challenges is the analysis and modelling of human-human interactions. The projects consist in the analysis of the low-level signals of speakers (e.g. gaze, head pose, gestures, speech), as well as the perception and inference of high-level features, such as the speakers' attention, the level of engagement in the discussion, and their conversational strategies. Some examples of similar work are documented here and here. Indicative literature is available here. Samples of other existing corpora will also be made available to interested parties.
    1. Prediction of the next speaker in multiparty interactions based on multimodal information provided by the participants' (a) gaze, (b) head turn/pose, (c) mouth opening and (d) verbal content.
    2. Measuring participants' conversational dominance in multiparty interactions by exploring (a) turn length, (b) speech duration, (c) interruptions, (d) feedback responses and (e) non-verbal signals (mouth opening, gaze, etc.)
    3. Create a successful attentive listener: investigate and decide upon the features that constitute an active listener, based on the analysis of feedback responses, as well as their frequency, duration, and intensity.
    4. Prediction of success in collaborative task-based interactions: investigate the factors on which the perception of the success on a task depends. This will involve a series of perception tests examining the team role of the speakers and their conversational behavior.
  31. Topics in collaboration with Dr. Erwan Moreau: supervised and unsupervised methods for author verification and related applications.
    The author verification problem consists in identifying whether two texts A and B (or two groups of texts) have been written by the same person. This task is the keystone of authorship-related questions, and has a range of applications (e.g. forensics). This problem can be addressed in a number of different ways, in particular in a supervised or unsupervised setting: in the former case, an annotated set of cases is provided (each case is a pair of texts A and B, labelled "yes" or "no" depending on whether they have the same author); in the latter case, no answer is provided.
    Given the availability of several datasets as well as a state-of-the-art authorship software system, the project consists in exploring a certain aspect or application of the topic, for example:
    1. What makes a case more difficult to answer than another? The task would be to study this question through experiments, and then implement a method to predict the level of difficulty of a given case.
    2. Design and implementation of a web interface around the authorship system, possibly presented as some kind of game with text.
    3. While ML systems can be good at giving the right answer, they are not always able to give a human-understandable explanation of the result. The task would consist in studying how to explain the results of some of the methods.
    4. It is harder to answer the question of authorship verification across genres (e.g. by comparing an email and a research paper). One way to improve the system in this case is to distinguish the features which are related to the author from those which are related to the genre.
  32. Social media analytics opens up a range of possibilities. Some are in the development of systems that support analysts without a computing background in scraping data that is visible to the general public and recording the (potentially multi-modal) data with indexing supported by appropriate meta-data (e.g. location, date, provenance, etc.). Other possibilities involve data analytics in relation to social media content (but presuppose that appropriate data sources are available). A range of projects in this space in collaboration with Emma Zara O'Brien may address the topic of body image and attitudes in relation to demographic variables, such as age.
  33. Other topics to appear.
  34. Still other topics to be agreed upon individually.
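
For item 5 above, a minimal sketch of corpus-driven letter scoring; the scaling used here is one arbitrary choice among many:

    import math
    from collections import Counter

    def letter_scores(corpus, max_score=10):
        """Score letters inversely to their corpus frequency, scaled to 1..max_score."""
        counts = Counter(c for c in corpus.lower() if c.isalpha())
        total = sum(counts.values())
        rarity = {l: -math.log(n / total) for l, n in counts.items()}
        lo, hi = min(rarity.values()), max(rarity.values())
        span = (hi - lo) or 1.0
        return {l: 1 + round((r - lo) / span * (max_score - 1))
                for l, r in rarity.items()}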

Last Modified: Mon Sep 3 15:03:39 2018 (vogel)


Last updated 21 September 2018.