Saturday, 24 April 2010

Final Proposal

Better Print Support

Project Details

Title: Better Print Support.

Introduction:

'Better Print Support' is a project idea aimed at solving the current limitations that users find in Mapnik when printing maps to PDF documents and other output formats. These limitations are minor, since there is already strong development inside Mapnik's core addressing these issues, but they are important to consider for achieving optimal printing quality.

Background:

There are currently two desirable features that members of the Mapnik and related GIS communities have asked for and that would be suitable for implementing over the period of time set aside for Google's Summer of Code: finer control over scaling and resolution, and better post-processable output. The motivation for choosing these features is presented next:

Finer control over scaling and resolution

According to the description of the project, many applications that use Mapnik to offer their services are also interested in having greater control over the rendering process to produce output according to the target of display, be it a computer screen or a digital document that is meant to be printed. One such application is TownGuide [1]. This client application relies on Mapnik for rendering a high quality map in a digital image format, such as PNG and JPEG, which is later processed by a third-party PDF generator to produce PDF output. This application lets the user choose the size of the map, as well as the resolution at which it is printed over the surface of a PDF document.

The problem encountered by the author of this application was that adjusting the resolution at which the map was rendered caused the image to have a loss in quality, evidenced by 'pixelated' road casing lines and blurry edges. The adjustments of resolution were performed in a rather ad hoc manner: To achieve a higher resolution, directives in Mapnik were configured to produce the map in a larger size and resized later to fit a smaller area in the PDF document. This method had the counterproductive effect of displaying the text of streets and cities in small font sizes, since they were rendered by Mapnik according to the size that was originally requested.
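The effect on text can be illustrated with a little arithmetic. The following is a minimal sketch, not TownGuide's actual code: it assumes a label styled at 10 pixels (a hypothetical value) and computes its printed height in points when the image is placed on the page at a given resolution.

```python
POINTS_PER_INCH = 72.0


def printed_size_pt(size_px, dpi):
    """Physical size, in points, of an element that is size_px pixels tall
    once the raster image is placed on the page at the given resolution."""
    return size_px * POINTS_PER_INCH / dpi


# Hypothetical label size of 10 px; 100 and 300 dpi chosen for illustration.
for dpi in (100, 300):
    print(dpi, "dpi ->", round(printed_size_pt(10, dpi), 2), "pt")
```

Because the label keeps its pixel size while the pixels shrink on paper, tripling the resolution leaves the text at one third of its printed height.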

Illustrations 1 and 2 show a portion of the same map at different resolutions as produced by the application. As can be seen, a map printed at a lower resolution following the previously outlined method exhibits thick road casings with pixelated borders and blurry dotted coastlines. On the other hand, a map printed at a higher resolution shows elements that do not appear in the lower-resolution one, along with text that is hard to read.

This problem arises from the fact that resolution is being manipulated outside Mapnik's scope, using external libraries for modifying an image that is not scalable. The problem faced by TownGuide would be solved if Mapnik offered an interface for specifying the resolution at which the map should be rendered. The output quality of the images above would then improve if they were printed at a higher resolution.

Better post-processable output

Vector-format images are more flexible than raster-based ones when it comes to editing them after they have been rendered. A vector image in a common format can be processed by vector editing/drawing applications to meet the special needs of users. In this case, a map in a vector format could be zoomed in and resized without loss of definition in the image. A user could also be able to draw over the map to add important visualization aids and cartographic elements (grids, coordinates, scale bars and suchlike) in an easy and flexible way.

Applications based upon Mapnik that do this kind of processing are TownGuide and MapOSMatic [4], which also produce output in PDF format. Since PDF is also vector-based, the integration of maps in vector format, like SVG, would be seamless and smooth.

The idea:

I have divided this section, as above, into two subsections, each addressing one of the features intended to be developed in this project.

Finer control over scaling and resolution

There are currently two main renderers in Mapnik's core, one based upon the Antigrain Geometry (AGG) graphics library and the other upon Cairo. The AGG renderer only supports output in raster format (JPEG and PNG), whereas the Cairo renderer exploits the concept of surface [5] (the target on which drawing operations are performed) found in Cairo to define an abstract interface behind which different image/document formats can be implemented for rendering, both raster and vector (PDF, PNG, PostScript, SVG, etc.).

With respect to the raster or bitmap formats, it is widely known that they are prone to loss of quality after applying a transformation. Because of this, scaling needs to be performed before rendering the image through, possibly, the use of a scaling factor that would be multiplied by the original value of elements such as line strokes and fonts [9].

The implementation of such a scaling mechanism is needed when it comes to adjusting the resolution at which the image of a map is required. The goal is to make it possible to render maps at high resolution, so that the quality of the image in the PDF document is comparable to that present in other formats. Once the scaling of raster images is implemented, a client application would specify the desired resolution, which would then be expressed as a scaling factor for determining the size of symbols. This has already been tested for ticket #343, and the remaining task would be to determine how resolution parameters will be exposed to the user. One of the issues related to this design decision is whether the resolution parameter should be accompanied by units of measurement or be expressed as a scalar value inside some predefined range.
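As a rough illustration of how a requested resolution could be turned into a scaling factor, consider the sketch below. It is not Mapnik's API: the function name is hypothetical, and the baseline is an assumption, namely that styles are authored for a nominal pixel of 0.28 mm (about 90.7 dpi).

```python
MM_PER_INCH = 25.4

# Assumption: styles are designed for a nominal 0.28 mm pixel (~90.7 dpi).
ASSUMED_BASE_PIXEL_MM = 0.28
BASE_DPI = MM_PER_INCH / ASSUMED_BASE_PIXEL_MM


def scale_factor(target_dpi, base_dpi=BASE_DPI):
    """Factor by which stroke widths, font sizes and symbol sizes would be
    multiplied so they keep their physical size at target_dpi."""
    if target_dpi <= 0 or base_dpi <= 0:
        raise ValueError("resolutions must be positive")
    return target_dpi / base_dpi


factor = scale_factor(300)
# A 2 px road casing designed at the baseline, rendered for 300 dpi output.
print(round(factor, 2), round(2.0 * factor, 2))
```

Expressing the parameter in dpi is only one of the options discussed above; a raw scalar would skip the division and pass the factor in directly.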

A related issue is that of variable units. Measurements and quantities could be easier to handle if they were expressed in units other than pixels (mm, cm, pt, etc.). This would require the implementation of functions for converting from each of the new units to pixels. Once the conversion is done, everything would be handled in pixels internally. After the internal processing is done, a final conversion would be applied to display everything that is measurable in the map in the units specified by the user.
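A minimal sketch of such conversion functions is shown here. The function names and the choice of resolution as an explicit argument are assumptions made for illustration; the actual interface is one of the things the project would have to define.

```python
# Length of each supported unit expressed in inches.
INCHES_PER_UNIT = {
    "in": 1.0,
    "cm": 1.0 / 2.54,
    "mm": 1.0 / 25.4,
    "pt": 1.0 / 72.0,
}


def to_pixels(value, unit, dpi):
    """Convert a length in user units to pixels at the given resolution."""
    if unit not in INCHES_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit!r}")
    return value * INCHES_PER_UNIT[unit] * dpi


def from_pixels(pixels, unit, dpi):
    """Convert a length in pixels back to user units for display."""
    if unit not in INCHES_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit!r}")
    return pixels / dpi / INCHES_PER_UNIT[unit]


# A 0.5 mm line width at 300 dpi, converted to pixels and back again.
width_px = to_pixels(0.5, "mm", 300)
print(round(width_px, 3), round(from_pixels(width_px, "mm", 300), 3))
```

Keeping a single table of units in inches means that adding a new unit requires one new entry rather than a new pair of functions.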

The project will then consider:

  • Further research in Mapnik's code base and trac to determine the extent to which scaling has been implemented. Tickets #259 and #343 address a portion of the work that must be done in this area.

  • Addition of an interface that would be exposed to the user for specifying resolution. This interface would be generic enough to be used across different renderers and would be reflected as a parameter added to functions and stylesheets. This parameter would be used internally to scale symbols, line strokes and fonts. With respect to text scaling, research inside the project would determine the best solution for the problem outlined in the 'Background' section of this document.

  • Implementation of an algorithm for scaling raster symbols (markers, placemarks, etc.). Raster symbols will be scaled before rendering, to avoid the pixelation that occurs when transforming the rendered image.

  • Addition of an interface for specifying units, both in input and output. This point includes the implementation of a set of functions for converting from user units to pixels. The units considered are cm, mm and in (see ticket #389).

Better post-processable output

A third renderer would be implemented to produce output in SVG format, following the model of agg_renderer and cairo_renderer, which are subclasses of feature_style_processor. This will require the implementation of the methods for processing and rendering each basic symbolizer. For producing the output, Boost::Spirit::Karma would be used to increase performance.

The project will also consider:

  • Learning the Boost::Spirit::Karma library.

  • Design and implementation of an SVG renderer, based on Boost::Spirit::Karma. The use of Karma as a basis follows the need of producing SVG output efficiently [6], since cairo_renderer already produces SVG output, but does so in a rather slow way [7].

Since SVG is a common vector format, popular vector editing applications, like Inkscape and Adobe Illustrator, may be used to edit maps further. One of the features available would be that of defining map layers as SVG groups [10], so it can be possible to hide layers/groups using SVG's visibility attribute [11].
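To make the layers-as-groups idea concrete, here is a small sketch that writes each map layer as an SVG group and hides selected groups through the visibility attribute. It uses Python's standard xml.etree.ElementTree purely for illustration, not Boost::Spirit::Karma as proposed above; the function name, layer names and path data are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_svg(layers, width, height, hidden=()):
    """Serialize layers as SVG groups.

    layers is a list of (layer_name, [path_data, ...]) pairs; every layer
    named in hidden gets visibility="hidden" on its group.
    """
    ET.register_namespace("", SVG_NS)
    root = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"version": "1.1", "width": str(width), "height": str(height)},
    )
    for name, paths in layers:
        group = ET.SubElement(root, f"{{{SVG_NS}}}g", {"id": name})
        if name in hidden:
            group.set("visibility", "hidden")
        for data in paths:
            ET.SubElement(
                group,
                f"{{{SVG_NS}}}path",
                {"d": data, "fill": "none", "stroke": "black"},
            )
    return ET.tostring(root, encoding="unicode")


# Hypothetical layers: one visible road layer and one hidden label layer.
print(build_svg(
    [("roads", ["M 0 0 L 100 50"]), ("labels", ["M 10 10 L 20 10"])],
    width=200, height=100, hidden={"labels"},
))
```

An editor such as Inkscape can then toggle a whole layer by changing one attribute on its group, without touching the individual paths.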

Two of the goals to achieve in Better post-processable output are:

  • To provide a specialized renderer dedicated solely to the production of SVG maps. Cairo renderer already provides vector output in SVG format, but the creation of a separate component to handle SVG would help to improve the overall design.

  • To optimize the production of vector output through the use of specialized output generation libraries.

As can be observed, Finer control over scaling and resolution has the goal of offering a more flexible interface to the user and is based upon the existing development of the AGG and Cairo renderers. The project would build on these renderers to strengthen the services they provide. On the other hand, the SVG renderer would be developed almost from the ground up, taking advantage of the models and design of the other two renderers.

In some way, Finer control over scaling and resolution is intended to give support to existing applications that are already based on the current services Mapnik offers, whereas Better post-processable output is planned to offer a more attractive alternative for client applications that may be currently under development or with the possibility to make use of the new tool. This would certainly increase the coverage of mapping and printing needs that Mapnik is currently able to solve.

Goals:

As can be observed, the project considers several important goals, each of which falls under one of the following headings:

  • Finer control over scaling and resolution.

  • Better post-processable output.

In order to make this project achievable, I've decided to prioritize the implementation of the ideas in the order suggested above, that is, the main goal of the project is to offer client applications an interface for specifying the output resolution. The reasons that drive this decision are:

  • There is a project for OpenStreetMap, called 'Easy Printable Maps', that may rely upon the improvements in resolution flexibility [8]. One of the key features provided by this project will be the possibility of printing maps to different paper sizes. In order for the quality of a map printed in large format to be preserved, high resolution images must be produced. This project may require support for variable resolution pretty early in its development.

  • Many client applications that use Mapnik as backend [1] and that don't make use of its vector features would benefit without considerable changes to their code bases.
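The link between paper size and image size can be shown with a short calculation. This is only a sketch with a hypothetical function name; it uses the standard A4 dimensions of 210 x 297 mm and two example resolutions.

```python
MM_PER_INCH = 25.4


def raster_size_px(width_mm, height_mm, dpi):
    """Pixel dimensions needed to cover a sheet of paper at a resolution."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


A4_MM = (210, 297)
for dpi in (100, 300):
    print(dpi, "dpi ->", raster_size_px(*A4_MM, dpi))
```

At 300 dpi an A4 sheet already needs an image of roughly 2480 x 3508 pixels, which is why larger formats depend on resolution being handled inside the renderer rather than by resizing afterwards.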

After achieving this first goal, my focus would be on Better post-processable output. I consider this approach to be beneficial in the following way: Finer control over scaling and resolution will require a deep understanding of Mapnik's architecture, which will in turn require me to experiment with its current capabilities. After gaining experience and getting familiar with the code, I will feel more comfortable working on Better post-processable output, which, in my opinion, seems to be more demanding.

The above decision also follows my concern for making this project feasible within the time frame given to it.

Project plan (how do you plan to spend your summer?):

Before May 24 (~ 7 weeks)

The period of time before the summer of code begins, which is about 7 weeks, will be dedicated to learning Mapnik's design and code, as well as the most important concepts of cartography. This is a list of topics on which I need to be prepared before May 24:

  • General concepts of cartography and mapping, such as projections and symbols.

  • General concepts used in Geographic Information Systems and mapping applications, such as shapefiles, symbolizers, styling, etc.

  • Use of Cairo and AGG graphics libraries.

  • Use of Mapnik stylesheets, python bindings, cairo_renderer, agg_renderer, pdf_renderer.

  • Use of GDAL utilities.

  • SVG and PDF specifications.

  • Implementation of the examples found in: http://mapnik-utils.googlecode.com/svn/example_code/.

  • Check tickets #343 and #389.

This period of time will let me prepare for dealing with the project when the summer starts. Since I am new to mapping and cartography, it is important that I start learning the concepts and techniques as soon as possible.

It will also be important for me to discuss further my ideas with mentors and members of the community and ask questions to clarify any doubts that may arise. The project should be defined and understood clearly before the summer of code begins. Hence, I plan to be constantly chatting in the IRC and sending emails to the mailing list.

SUMMER OF CODE BEGINS:

Work Methodology:

During the internship I am doing this semester I realized that the best approach for working on a project that one is new to is to release prototypes that demonstrate that the ideas sketched in the design are feasible. The prototypes are later refactored to improve the design and achieve modularity and flexibility. An up-front design is always necessary, but it should not take too long to produce, since it is likely to change when taken to the implementation.

Another practice I adopted was that of writing unit tests before implementing each self-contained piece of code. This practice is common in Test Driven Development (TDD) methodologies and has given me good results. When you write tests before the actual code (just employing the interfaces), you tend to think about the client code that might use the functionality, which leads you to simplify interfaces and break functionality into independent objects and functions.

Finally, the use of source control tools (like git or svn) and a bug tracker (like bugzilla) are both practices that help you keep the complexity of the project under control.

Week #1: May 23rd - 29th

In this week I will be setting up my work environment (source control, compiler, unit testing framework, etc.), if I haven't done so. After having the environment ready, I'll proceed to sketch an initial design for the resolution interface and present it to my mentor. After discussing the design and receiving feedback, I'll make the necessary corrections and modifications and then implement a prototype based on the initial design (this may take up to two weeks, however).

The following issues need to be investigated before the initial design of the interface is implemented:

  • How will the specification of resolution be presented to the user?

  • How will the stylesheet change to accommodate the addition of resolution parameters?

  • How will the python bindings be adapted to reflect the interface?

  • Will resolution be specified using units or as a scalar value?

Week #2: May 30th - June 5th

In this week I will be incorporating the suggestions and corrections my mentor makes about the initial design. These corrections would be reflected in the code of the prototype(s). The initial design, as described later, would be based on the idea of making the interface abstract enough to be used by the existing renderers, as well as by the future SVG renderer.

Since the interface would be connected to the scaling mechanism, initial thoughts and research would be required (if this had not been possible before). Before implementing the scaling mechanism, the following issues should be clarified:

  • Which elements will be subject to scaling (line strokes, fonts, raster symbols, etc.)?

  • What kind of algorithm is needed for scaling raster symbols before rendering?

  • How will the issues associated with the patch of ticket #343 be solved (thick road casings, for example)?

  • How could the assumption of pixel size in ticket #343 be replaced with a more abstract solution?

After defining the previous issues, I would start with the implementation of the scaling mechanism in elements like line strokes. Fonts would require further research to understand how they are handled. It has been suggested that I ask Mr. Tom Carden for help in defining the best way to scale elements, raster symbols especially.

Before implementing the prototype, I would write a set of basic unit tests against which the real functions would be implemented.

Week #3: June 6th - 12th

I will continue with the implementation of the scaling mechanism, now focusing on fonts and raster symbols. I will also be defining the way agg_renderer and cairo_renderer would need to be modified in order to accept the changes caused by the scaling mechanism. My feeling now is that the renderers need not be modified, since the scaling will be done numerically rather than on the image directly. The dimensions of the elements would be passed to the renderers, which would use them for rendering.

The following issues may arise:

  • Does the renderer manipulate the dimension of the element, or does it just receive it and render accordingly?

  • Is there a way to avoid applying the scaling factor in each of the functions that require it?

  • How is the scaling factor going to affect the different elements? Will fonts need to be scaled in a different proportion (to avoid decreasing their size too much when high resolution is specified)?
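One possible way to explore these questions is sketched here: style dimensions are scaled once, numerically, before anything reaches the renderer, with a separate knob for fonts. Everything in it is hypothetical, including the LineStyle and TextStyle classes and the font_gain parameter; it does not reflect how Mapnik's symbolizers are actually defined.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LineStyle:
    stroke_width: float


@dataclass(frozen=True)
class TextStyle:
    font_size: float


def prescale(styles, factor, font_gain=1.0):
    """Return scaled copies of the styles; the originals are left untouched.

    font_gain is a hypothetical extra multiplier applied only to fonts, so
    that text can be scaled in a different proportion from line strokes.
    """
    scaled = []
    for style in styles:
        if isinstance(style, TextStyle):
            size = style.font_size * factor * font_gain
            scaled.append(replace(style, font_size=size))
        elif isinstance(style, LineStyle):
            width = style.stroke_width * factor
            scaled.append(replace(style, stroke_width=width))
        else:
            raise TypeError(f"unsupported style: {type(style).__name__}")
    return scaled


styles = [LineStyle(stroke_width=2.0), TextStyle(font_size=10.0)]
print(prescale(styles, factor=3.0, font_gain=0.5))
```

With this arrangement a renderer would only ever see final dimensions, so the factor would not have to be applied again inside each drawing function.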

Week #4: June 13th - 19th

If the scaling mechanism is ready, I would release it to ask the community for support on testing it. I would be receiving feedback from them to make the appropriate modifications (the community would be involved in each of the previous weeks, since they would be one of the primary sources of information and suggestions, together with the mentor and the documentation available).

In the meantime, my focus would be on the design and research related to unit handling. It is possible that I might have started to deal with units before this week, since the resolution interface certainly requires the definition of the units to be used (or their absence in case a raw scale factor is used instead). Unit handling will affect both inputs and outputs, so I would also be defining the units that will be recognized, as well as conversions between them and pixels, which I assume will be used internally for processing.

Week #5: June 20th - 26th

I will continue with the implementation of unit handling, finishing the definition of their use in inputs, to later start with the definition of the outputs. What I would need to define first is:

  • Which elements displayed on the map will be accompanied by units?

  • How will the user specify the desired units for each of the elements, both in inputs and outputs?

  • What are the conversions needed?

Week #6: June 27th - July 3rd

I will leave this week as a buffer in case I am not able to accomplish the previous goals in time. By this time, I would expect to have the main functionality required for Finer control over scaling and resolution fairly advanced, with only testing and debugging remaining.

Week #7: July 4th - 10th

In this week I would be starting with the design of the second part of the project, Better post-processable output. I have been told that the main contact here would be Mr. Artem Pavlenko. As in the first part, I would start with an initial sketch of the design, based strongly on the existing renderers. I expect to have already had time for learning the Boost::Spirit::Karma library; if this is not the case, I would also be studying the library and doing exercises with it.

Much of the work this week will have to be on discussing and asking for information. The scope of the SVG renderer will need to be defined, as well as the interfaces for input and output.

Week #8: July 11th - 17th (Mid-Term Evaluation)

I would continue with the implementation of the renderer, while receiving feedback from the evaluation of the previous weeks. I would incorporate any suggestions of improvement or modifications.

The first prototype would need to define the elements, i.e. symbolizers, that will be considered for rendering in SVG format. Other issues such as layer handling would need to be taken into account this week.

Week #9: July 18th - 24th; Week #10: July 25th - 31st; Week #11: August 1st - 7th

These weeks would be dedicated to implementing the SVG renderer and reviewing the work with the mentors of the project.

Week #12: August 8th - 14th (Firm "Pencils Down" date)

I would be receiving feedback and finishing what I managed to accomplish.

Future ideas / How can your idea be expanded?:

The following paragraphs explain my point of view on how these ideas could be expanded in the future:

Finer control over scaling and resolution

This part of the project has well-defined boundaries that will be covered in the summer.

Better post-processable output

Due to the apparent complexity of this part, the project may be constrained to the implementation of a subset of the SVG specification. Therefore, other people may build upon the foundations set by this project to implement more complex behaviour.

Explain how your SoC task would benefit the Mapnik community and what other projects might benefit directly or by example:

The possibility to gain control over the resolution at which a map is rendered will certainly increase the quality of the output image when printed to PDF documents. Furthermore, members of the community have asked for some of the issues that the project would cover to be addressed, a sign of the interest in the project.

With respect to other projects, I have already mentioned the relationship between this project and 'Easy Printable Maps' of OpenStreetMap. TownGuide would also benefit from higher-resolution maps, which may improve the quality of the printed maps this application produces.

Briefly, 'Easy Printable Maps' will be an application that will offer the user a web-based interface for manipulating the rendering of a map, with the intention of customizing and preparing it to be printed. Options such as target paper size and resolution will let the user print a map in small and large format, with the corresponding quality as specified by the indicated resolution. 'Easy Printable Maps' may be able to achieve its goal by using the interface offered by this project.



Experience

Please provide details of previous experience with C++, python:

C++

I learned the foundations of the language through a school project/program for the 'Operating Systems' class developed in the C language. The project had the goal of making use of the Windows and POSIX APIs for making different system calls (to work with threads, files, interprocess communication methods, etc.) Although I did not have the chance to work with the object-oriented concepts introduced by C++, I acquired the basic knowledge of pointer/reference handling and memory management.

Formally, I started using the language in the 'Algorithms' class of the degree program, where I implemented algorithms for sorting sequences, processing text, and analyzing graphs. Since 'algorithms' is the topic in computer science that I am most interested in, I have dedicated much of my spare time to implementing other classic algorithms (with C++ as my preferred programming language for doing so).

Finally, the internship I am doing at Continental Automotive this semester has allowed me to practice and learn the language. The project I am working on is about an application for analyzing the results of various kinds of tests performed on hardware.

Python

I learned the language during the 'Computer Graphics' course of my degree program, in which I used PyOpenGL, an OpenGL python binding, to program a simple 3D editor and a flight simulator. In addition to PyOpenGL, Numpy was also introduced to me during this class.

After that course, I began to use Python for projects of subsequent courses: For the 'Programming Languages' class, I chose it to implement a simple video game using PyGame, a Python library for game development. For the 'Computational Intelligence' class, I used it to program some algorithms of the soft computing area, including 'Ant Systems', 'Genetic Algorithms' and 'Simulated Annealing'.

Please provide details of previous experience with the Antigrain Geometry Library, Cairo, or Boost:

Unfortunately, I have no experience in programming with the Antigrain Geometry library or with Cairo, though I have been studying the latter during the past few weeks.

Boost

The application I am working on in my internship this semester is strongly based on the following libraries:

  • Boost::Exception

  • Boost::Foreach

  • Boost::Lexical Cast

  • Boost::Optional

  • Boost::Property Tree

  • Boost::Smart Ptr

  • Boost::String Algo

Please provide details of previous experience with mapping or GIS:

I have little previous experience with mapping and GIS. My knowledge of mapping and cartography is comprised of what I have learned since I first heard about Mapnik on the Google Summer of Code program website and of what I have read in a book called 'Web Mapping: Using Open Source GIS Toolkits' by Tyler Mitchell (available in Google Books).

I must admit that my knowledge in this area is limited, yet I am not completely unfamiliar with Geographic Information Systems: Two years ago, I participated in the development of a web application that processed GPS information from a server that was fed in real time with data sent from a GPS device connected to the computer of a vehicle. The application was responsible for displaying, on a map of a city, the points that were visited by the vehicle in order to trace routes and calculate travel times and distances. The maps were provided by Google and accessed through the Google Maps API.

Please tell us why you are interested in GIS and open source software:

GIS

GIS attracted my attention because cartography is a discipline founded on mathematics and geometry that actually applies the knowledge, theory and techniques of these sciences to solving a practical problem. The translation of a mathematical problem into a solution modeled by a computer program is also something I have always been very enthusiastic about. Due to the nature of my degree program (technology-oriented and with a strong focus on research), the opportunities to do so in a school project have been few, though. Hence, participating in this project would give me the opportunity to explore this approach further.

Another reason for choosing GIS, besides its mathematical basis, is the fact that much of the emphasis of my degree program is on computer graphics and GIS applications are much about dealing with image manipulation and analysis. Participating in a project of this kind would allow me to exercise and increase the knowledge I have of this area, and interacting with people experienced in the field would certainly benefit my learning.

In reference to Mapnik, my interest in it grew stronger when I saw the quality that could be achieved in the image output and the widespread renown it has among web-based applications offering mapping services.

Open Source

The primary reason why I am interested in open source software is that projects of this kind are unlikely to be found when working for an enterprise. Collaboration is a second reason: People of different cultures and skills contribute to a project to make it successful, helping each other to achieve common goals.

Please tell us why you are interested in your specific coding project:

I consider that this project is just as important as any of the other projects presented in Mapnik's list of ideas, since they are all directed towards increasing the range of functionalities and services offered by this software, and improving the quality and flexibility of the code. However, 'Better Print Support' has a direct impact on other open source projects like OpenStreetMap [12], Walking-Papers [13], and MapOSMatic, whose developers are looking for a way to easily adapt the output to various paper sizes without loss of quality.

This project was suggested to me by Waldemar Quevedo, a student who is applying for a project in OpenStreetMap. The project he is applying for, titled 'Easy Printable Maps', is a potential first user of the features related to resolution. I consider that this relationship between projects is beneficial because some design issues associated with this project need to be defined in a way that is both flexible enough to be reused across components inside Mapnik and intuitive for the developer who uses Mapnik as a basis for implementing another service. The latter would be a condition that would be continuously tested and reviewed in order to ensure proper application in the other project (Easy Printable Maps).

Do you understand this is a serious commitment, equivalent to a full-time paid summer internship or summer job?

I understand the commitment I make when presenting this proposal. I am also aware that mentors kindly share their time to make this project possible. Therefore, I commit myself to following the rules set by the organization. My intention is to contribute to the Mapnik project in the best possible way.

References

[1] TownGuide: http://www.townguide.webhop.net/

[2] TownGuide - 100 dpi map: http://www.townguide.webhop.net/output/73/townguide_poster.pdf

[3] TownGuide - 300 dpi map: http://townguide.webhop.net/output/72/townguide_poster.pdf

[4] MapOSMatic: http://maposmatic.org/about/

[5] Cairo Surfaces: http://cairographics.org/manual/cairo-surfaces.html

[6] Karma's Performance: http://boost-spirit.com/home/2010/03/09/integer-to-string-conversion-karma-fastest-again/

[7] Better Print Support discussion at mapnik-devel mailing list: http://www.mail-archive.com/mapnik-devel@lists.berlios.de/msg00645.html

[8] Easy Printable Maps: http://wiki.openstreetmap.org/wiki/GSoC_Project_Ideas_2010#Paper_Output_Projects

[9] Scaling work in Mapnik: http://mapnik.dbsgeo.com/mapnik_logs/2010/04/08/

[10] SVG groups: http://www.w3.org/TR/SVG11/struct.html#Groups

[11] SVG groups as layers: http://groups.google.com/group/mozilla.dev.tech.svg/browse_thread/thread/a0b41bbf4675eede?pli=1

[12] OpenStreetMap: http://www.openstreetmap.org/

[13] Walking-Papers: http://walking-papers.org/

Other References

Mapnik discussions:

http://old.nabble.com/Re:--OSM-dev--Mapnik-output-resolution-td27495133.html

http://www.mail-archive.com/mapnik-devel@lists.berlios.de/msg00646.html

http://www.mail-archive.com/mapnik-devel@lists.berlios.de/msg00649.html

http://www.mail-archive.com/mapnik-devel@lists.berlios.de/msg00140.html

