Recently, there has been increased interest in making government data open and easily accessible to the public. Technological advances in the area of the semantic web have given rise to the so-called Web of data, which has given an even stronger push to such efforts. The Web of data envisions an extension of the World Wide Web (WWW) that is composed of resources that correspond to real-world entities and are interconnected with links denoting the relations between them. The concept of the Web of data has recently been realized in the research area of linked data, which studies how one can make data expressed in the Resource Description Framework (RDF) available on the Web and interconnect it with other data, with the aim of increasing its value for everybody. Semantic web technologies are therefore a natural fit for making government data open.
An important kind of government data is the data related to legislation. Legislation applies to every aspect of people's lives and evolves continuously, building a huge infrastructure of interlinked legal documents. Therefore, it is important for a government to offer services that make legislation easily accessible to the public and to specialized professionals who want to inform themselves, defend their rights, or cite and use legislation as part of their job. In this direction, many European Union (EU) countries have already computerized the legislative process by developing platforms that archive legislative documents in information systems and offer online access to them.
Web standards like XML, XSLT, and RDF facilitate the separation between content, presentation, and metadata, thus contributing to a better exploitation of information present in these documents. A popular format and vocabulary that is used for the encoding of the structure and content of legal and paralegal documents is CEN MetaLex (in the following just MetaLex), which has now been adopted and maintained by the European Committee for Standardization (CEN). MetaLex serves as an XML data exchange format, but recently an ontology counterpart of it has sprung up (http://justinian.leibnizcenter.org/MetaLex/metalexcen.owl) which is expressed in the Web Ontology Language (OWL) and used as a vocabulary for expressing metadata of legal documents.
A number of endeavors have employed MetaLex for organizing and offering national legislation to the public. For example, the MetaLex Document Server (http://doc.metalex.eu/) offers almost all the Dutch national regulations published by the official portal of the Dutch government. These regulations are offered both in MetaLex and as linked data in RDF. The United Kingdom also publishes legislation on its official portal (http://www.legislation.gov.uk/). Legislation is expressed using MetaLex, while metadata about it is offered in RDF using the MetaLex OWL ontology.
In Greece, there is still a very limited degree of computerization around the legislative process, and even the discovery of legislation related to a specific topic can be a hard task. A recent law and program, called Di@vgeia (https://diavgeia.gov.gr/), has tried to remedy this situation by obliging government institutions to upload their acts and decisions to the Web. The Di@vgeia portal offers basic keyword search over the content of legal documents and their metadata, as well as a service that allows developers to search this collection of documents based on some pre-selected metadata.
In this work, we follow in the footsteps of other successful efforts in Europe and aim at modernizing the way legislative work is offered to the public. In line with the goal of Di@vgeia, we envision a new state of affairs in which people have at their fingertips advanced search capabilities over the content of legislative work. We envision a paradigm in which legislative work is distributed in a form that developers can consume, so that it can also be combined with other open data to increase its value in the interest of the public. To the best of our knowledge, there is no other effort in Greece that takes this perspective on legislation and on related decisions made by government institutions and the administration.
We view legislation as a collection of legal documents with a standard structure. Legal documents may be linked to one another through modifications and citations, reflecting rich semantic information and interrelationships. A modern country needs intelligent services that not only present the textual content of legal documents but are also able to answer complex questions such as “Which legal documents has a certain minister signed during his service as Finance Minister?”, “Which legal documents have been modified and by whom?”, or “Retrieve the 10 most frequently modified legal documents between 2008 and 2013”. To enable the formulation and answering of such complex questions, we have designed and developed a prototype web application, called Nomothesi@, which offers Greek legislation in RDF as published in the government gazette. Greek legislation has been modeled in Nomothesi@ according to an OWL ontology that reuses the CEN MetaLex OWL ontology and extends it where needed to capture the peculiarities of Greek legislation. Nomothesi@ offers advanced presentational views and search functionality over the metadata and the textual content of Greek legislation. An important feature of Nomothesi@ is that it offers a SPARQL endpoint and a RESTful service that developers can use to consume its content programmatically and combine it with other open data. In this respect, Nomothesi@ can help open a new market built on semantic web technologies, with direct societal benefits and business opportunities.
Greek legislation is published through different types of documents, depending on the members of government who issued it and on the legislative procedure that was followed. It has a standardized structure that follows a prescribed encoding, and its content may be revised by subsequent modifications.
There are five primary sources of Greek legislation that we consider in this work: the constitution, presidential decrees, laws, acts of the ministerial cabinet, and ministerial decisions. These sources of legislation are materialized in legal documents, which are encoded following specific standards. Legislation is an event-driven process. Legal documents are published in the government gazette, their content may be modified by later legal documents, and finally they cease to be in force. In the course of this process, we need to capture the structural information of legislation and the evolution of its content through time, as given by the legislative modifications applied to the primary legal document.
Nowadays, the encoding of Greek legislation follows the rules set out in “Manual Directives for encoding of legislation”, which have been designed by the Central Committee of Encoding Standards and legislated in Law 2003/3133. The encoding of a legal document is organized in a tree hierarchy around the concept of fragments, which are articles, paragraphs, cases, or passages. These fragments are described below. Articles are the basic divisions in the text of a legal document. They are numbered using Arabic numerals (1, 2, 3, …) or, when a new article is inserted in an existing legal document, by combining an Arabic numeral with upper-case Greek letters (A, B, Γ, ...). An article may have a list of paragraphs that are numbered using Arabic numerals. If an article has a single paragraph, the numbering of that paragraph is omitted. Paragraphs may have a list of cases. Cases are numbered using lower-case Greek letters (α, β, γ, ...) and may have sub-cases, which are numbered using double lower-case Greek letters (αα, ββ, ...). The sentence between two full stops is termed a passage. Passages are the elementary fragments of legal documents and are written contiguously, i.e., without any line breaks between them. Passages are the building blocks of cases and paragraphs. Last, depending on their size, legal documents may be subdivided into larger units, such as books, chapters, or sections, which are numbered using upper-case Greek letters. The larger units and articles may have a title, which must be general and concise so as to convey their content, and which is used in the systematic classification of the substance of legal documents. In addition to the aforementioned structural elements, legal documents are accompanied by metadata information.
This includes the title of the legal document, which must be general enough but concise so as to reflect its content, the type (e.g., law, presidential decree), the year of publication, and the number (i.e., the serial number counting from the beginning of the year for each type). These four pieces of metadata information also serve as a unique identifier of the legal document. Of equal importance are the issue and the sheet number of the Government Gazette in which the legal document is published.
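The fragment hierarchy and metadata described above can be sketched as a small set of nested data types. This is only an illustrative model in Python, not the ontology used by Nomothesi@; all class and field names (`LegalDocument`, `Article`, `iter_passages`, and so on) are assumptions introduced for the example, and larger units such as books and chapters are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Passage:
    text: str  # one sentence; the elementary fragment


@dataclass
class Case:
    label: str  # lower-case Greek letter such as "α"; sub-cases use "αα"
    passages: List[Passage] = field(default_factory=list)
    sub_cases: List["Case"] = field(default_factory=list)


@dataclass
class Paragraph:
    number: Optional[int]  # None when the article has a single, unnumbered paragraph
    passages: List[Passage] = field(default_factory=list)
    cases: List[Case] = field(default_factory=list)


@dataclass
class Article:
    number: str  # "1", "2", ... or a numeral plus Greek letter for an inserted article
    title: Optional[str] = None
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class LegalDocument:
    title: str
    doc_type: str  # e.g. "law" or "presidential decree"
    year: int
    number: int  # serial number within the year, per type
    articles: List[Article] = field(default_factory=list)

    def identifier(self) -> Tuple[str, str, int, int]:
        # The four metadata fields that the text says identify a document.
        return (self.title, self.doc_type, self.year, self.number)


def _iter_case_passages(case: Case) -> Iterator[Passage]:
    yield from case.passages
    for sub_case in case.sub_cases:
        yield from _iter_case_passages(sub_case)


def iter_passages(doc: LegalDocument) -> Iterator[Passage]:
    """Walk the tree in document order and yield every passage."""
    for article in doc.articles:
        for paragraph in article.paragraphs:
            yield from paragraph.passages
            for case in paragraph.cases:
                yield from _iter_case_passages(case)
```

Modeling sub-cases as nested `Case` objects keeps the tree uniform, so a single recursive walk reaches every passage regardless of depth.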
When a reference to other legislation is necessary, it should be made uniformly throughout the text. Specifically, for purposes of accuracy and readability, the reference must bear the number of the legal document and the year of publication. At the first occurrence of the legal document, the issue and the sheet number of the Government Gazette must be stated in brackets. Where the reference concerns a specific fragment of the legal document, that fragment should also be mentioned.
The amendment of a legal document by subsequent legal documents is common international practice. Unfortunately, given the encoding of legal documents, there is no standard methodology for the codification of this legislative concept. This makes the whole process of amendment very challenging from our perspective. By systematic observation, we reached the conclusion that there are three main types of legislative modifications: 1) the substitution of a specific fragment by another one introduced by a subsequent legal document, 2) the insertion of a new fragment, and 3) the deletion of a specific fragment. All these kinds of modifications produce new versions of the original legal document. At any point in time, the state of a legal document corresponds to the original document as revised by all modifications applied to it up to that point.
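The versioning rule above (the state at a given time is the original document with every earlier modification applied) can be illustrated with a short sketch. It assumes, purely for the example, that a document is a flat mapping from a fragment address to its text; the `Modification` class, the address format, and the `version_at` function are hypothetical names, not part of Nomothesi@.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Modification:
    kind: str  # "substitution", "insertion" or "deletion"
    fragment_id: str  # hypothetical fragment address, e.g. "article/1/paragraph/2"
    effective: date
    new_text: Optional[str] = None  # not used for deletions


def version_at(original: Dict[str, str],
               mods: List[Modification],
               when: date) -> Dict[str, str]:
    """Return the document state at `when`: the original plus all earlier modifications."""
    state = dict(original)  # never mutate the enacted version
    for mod in sorted(mods, key=lambda m: m.effective):
        if mod.effective > when:
            break  # later modifications do not affect this version
        if mod.kind == "deletion":
            state.pop(mod.fragment_id, None)
            continue
        if mod.kind not in ("substitution", "insertion"):
            raise ValueError(f"unknown modification kind: {mod.kind}")
        if mod.new_text is None:
            raise ValueError(f"{mod.kind} requires new_text")
        if mod.kind == "substitution" and mod.fragment_id not in state:
            raise KeyError(f"cannot substitute missing fragment {mod.fragment_id}")
        state[mod.fragment_id] = mod.new_text
    return state
```

Because the sort is stable, modifications that take effect on the same day are applied in the order in which they were supplied.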
Uniform Resource Identifiers (URIs) are short strings that identify resources on the Web: documents, images, downloadable files, services, electronic mailboxes, and other resources. Such identification enables interaction with representations of the web resource over a network, typically the World Wide Web, using specific protocols. In addition to using HTTP requests appropriately, resource naming is arguably the most debated and most important concept to grasp when creating an understandable, easily leveraged Web service API. When resources are named well, an API is intuitive and easy to use. When they are named poorly, that same API can feel clumsy and be difficult to use and understand. Essentially, a RESTful API ends up being simply a collection of URIs. In our platform, each resource has its own address or URI: every interesting piece of information the platform can provide is exposed as a resource. In other words, the RESTful principle of addressability is covered by the URIs. We have chosen that each resource in a service suite will have at least one URI identifying it. It is to our benefit when that URI makes sense and adequately describes the resource. URIs should follow a predictable, hierarchical structure to enhance understandability and, therefore, usability: predictable in the sense that they are consistent, hierarchical in the sense that the data has structure and relationships.
Fixed URIs for divisions of legislation are very important, as they are for legal documents in general. Various initiatives are trying to extend existing bibliographic schemes with a reliable classification for legislation. Their aim is to facilitate the process of creating URIs for legal sources, regardless of the availability of a document on the Web, the location of a document, and the way to access it. Based on international practice and the particularities of Greek legislation, we propose a URI schema that is very similar to the one used in the UK. In our platform every single legal document, its subparts, its versions, and the services over it are resources, which need to be addressed by a specific URI system. The URI system must be persistent and built so that the URI for any kind of resource is highly guessable. There are a number of different ways one might assign an unequivocal identifier to a legislative document. We have decided to use HTTP URIs. These URIs have been designed following our own guidelines on the matter, and we hope that our work will form a new way of describing Greek legislation on the Web.
We now present the schema of URIs that one encounters when using our REST services.
Any field between curly brackets needs to take a specific value. The type of legislation ranges over all the different types of Greek legislation, using the encoding of our platform (e.g., con, law, pd, etc.). Year is the year of publication (e.g., 2012) and ID is the number of the specific legal document. So, for example, if we want to address Law 12 of 2014, the corresponding URI is:
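As a purely illustrative sketch of how such a template can be filled in, the Python function below substitutes the three fields into a path. The base address `http://example.org/legislation` and the field order type, year, ID are assumptions made for this example only and should not be read as the platform's actual addresses; `document_uri` is likewise a hypothetical helper.

```python
# Placeholder base address: an assumption for illustration, not the real service.
BASE = "http://example.org/legislation"


def document_uri(doc_type: str, year: int, doc_id: int, base: str = BASE) -> str:
    """Fill a '{type}/{year}/{id}' style template (field order is an assumption)."""
    if not (doc_type.isalpha() and doc_type.islower()):
        raise ValueError("type must be a lower-case code such as 'law' or 'pd'")
    if year <= 0 or doc_id <= 0:
        raise ValueError("year and id must be positive integers")
    return f"{base.rstrip('/')}/{doc_type}/{year}/{doc_id}"


# Law 12 of 2014 under the assumed template and placeholder base.
print(document_uri("law", 2014, 12))
```

Keeping the template in one function means that fragments or versions could later be addressed by appending further path segments in a consistent way.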
A further goal, and at the same time a vision, of this project is for Nomothesi@ to serve as a RESTful (Representational State Transfer) Web platform that provides keystone ingredients for other, possibly more specialized, applications. We have developed, and hope to continue developing, services that allow our API users to retrieve Greek legislation in many forms (PDF, RDF, XML and JSON) by accessing specific URIs (Uniform Resource Identifiers) via HTTP GET requests.
As stated above, Nomothesi@ is a RESTful API. One of its goals is to provide a series of Web services over Greek legislation in order to encourage further and more specialized projects. We describe some of these services in order to underline the benefits of working with one single RDF data model through Sesame Server, hoping to inspire other projects to adopt this method and to shape a simpler Semantic Web. Through our API one can request any legal document in PDF, XML, RDF, HTML and JSON format, in its enacted version, in its current updated version, or in the version valid on a specific date. In terms of engineering, the RDF data model played a decisive role. We eliminated unnecessary calls to the database and, at the same time, used RDF text annotations to separate the text of a legal document according to the request. In this way we not only keep the complexity of requests low (one single query fetches everything and then we decide what to use), but we also obtain an extensible model that can easily accommodate more languages in the future.
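One common way for a client to ask a RESTful service for a particular format is HTTP content negotiation through the `Accept` header. The sketch below builds such a GET request in Python without sending it. Whether Nomothesi@ selects formats through the `Accept` header, and whether a dated version is requested through a `date` query parameter, are assumptions made for this example; the media types are the standard ones for each format, and `build_request` is a hypothetical helper.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Standard media types for the formats named in the text.
MEDIA_TYPES = {
    "pdf": "application/pdf",
    "xml": "application/xml",
    "rdf": "application/rdf+xml",
    "html": "text/html",
    "json": "application/json",
}


def build_request(uri: str, fmt: str, version_date: Optional[date] = None) -> Request:
    """Build (but do not send) a GET request for one representation of a document."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {fmt}")
    url = uri
    if version_date is not None:
        # Hypothetical 'date' parameter selecting the version valid on that day.
        url = f"{uri}?{urlencode({'date': version_date.isoformat()})}"
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]}, method="GET")


# The request could then be sent with urllib.request.urlopen(request).
request = build_request("http://example.org/legislation/law/2014/12", "json")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

With this approach the resource URI stays the same for every format, and only the header changes, which matches the idea of one request path with format-specific response construction at the end.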
SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL contains capabilities for querying required and optional graph patterns along with their conjunctions and disjunctions. It also supports extensible value testing and constraining queries by source RDF graph. To the best of our knowledge, our project is the first, both for Greek legislation and worldwide, to serve a SPARQL endpoint over legislation in a RESTful manner. Users can now take advantage of legal documents not only by reading them, but also by querying all their semantic information. This opens a new path in the way legislation is served and represented by organizations and official institutions, and we hope that many others will follow until government archives of all types are unified. The main query operations in this system are:
• Obtaining the legislative document valid at a given date or within a given time interval.
• Historical evolution of a legislative document.
• Full text search in the text of the legislative document.
• Laws repealed by a law.
• Obtaining all kinds of Metadata (the model allows new Metadata to be easily included).
• Description queries on the RDF graph.
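As an illustration of how a client might use such an endpoint, the Python sketch below builds a SPARQL query for the most frequently modified documents in a range of years and encodes it as a GET URL using the `query` parameter defined by the SPARQL Protocol. The endpoint address, the `ex:` namespace, and the properties `ex:modifies` and `ex:year` are placeholders invented for this example; they are not the vocabulary of the Nomothesi@ ontology.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration only.
ENDPOINT = "http://example.org/sparql"


def most_modified_query(start_year: int, end_year: int, limit: int = 10) -> str:
    """Build a SPARQL query over a hypothetical 'ex:' vocabulary."""
    start_year, end_year, limit = int(start_year), int(end_year), int(limit)
    return (
        "PREFIX ex: <http://example.org/ontology#>\n"
        "SELECT ?doc (COUNT(?mod) AS ?modifications)\n"
        "WHERE {\n"
        "  ?mod ex:modifies ?doc ;\n"
        "       ex:year ?year .\n"
        f"  FILTER (?year >= {start_year} && ?year <= {end_year})\n"
        "}\n"
        "GROUP BY ?doc\n"
        "ORDER BY DESC(?modifications)\n"
        f"LIMIT {limit}"
    )


def query_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a SPARQL Protocol GET request URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


print(query_url(most_modified_query(2008, 2013)))
```

Casting the numeric arguments to `int` before interpolating them keeps arbitrary text from being spliced into the query string.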
One conclusion is that our architecture is type-free, which is key to REST services. We have one unified request-response architecture, in which the media types differ only in the final method that constructs the response. This suggests that RDF data models and Spring MVC are well suited to a RESTful API like Nomothesi@. We avoided writing long and messy methods for each media type and ended up with an extensible, simple sequence of request-building methods.