Friday, October 31, 2014

PDF in a Net, with Netius, a pure Python network library

By Vasudev Ram


I came across Netius, a pure Python network library, recently.

Excerpt from the Netius home page:

[ Netius is a Python network library that can be used for the rapid creation of asynchronous non-blocking servers and clients. It has no dependencies, it's cross-platform, and brings some sample netius-powered servers out of the box, namely a production-ready WSGI server. ]

Note: They mention some limitations of the async feature. Check the Netius home page for more on that.

To try out netius a little (not the async features, yet), I modified their example WSGI server program to serve a PDF of some hard-coded text, generated by xtopdf, my PDF creation library / toolkit.

The server, netius_pdf_server.py, running on port 8080, generates and writes to disk, a PDF of some text, and then reads back that PDF from disk, and serves it to the client.

The client, netius_pdf_client.py, uses the requests Python HTTP library to make a request to that server, gets the PDF file in the response, and writes it to disk.

Note: this is proof-of-concept code, without much error handling or refinement. But I did run it and it worked.

Here is the code for the server:
# test_netius_wsgi_server.py 

import time
from PDFWriter import PDFWriter
import netius.servers

def get_pdf():
    pw = PDFWriter('hello-from-netius.pdf')
    pw.setFont('Courier', 10)
    pw.setHeader('PDF generated by xtopdf, a PDF library for Python')
    pw.setFooter('Using netius Python network library, at {}'.format(time.ctime()))
    pw.writeLine('Hello world! This is a test PDF served by Netius, ')
    pw.writeLine('a Python networking library; PDF created with the help ')
    pw.writeLine('of xtopdf, a Python library for PDF creation.')
    pw.close()
    pdf_fil = open('hello-from-netius.pdf', 'rb')
    pdf_str = pdf_fil.read()
    pdf_len = len(pdf_str)
    pdf_fil.close()
    return pdf_len, pdf_str

def app(environ, start_response):
    status = "200 OK"
    content_len, contents = get_pdf()
    headers = (
        ("Content-Length", content_len),
        ("Content-type", "application/pdf"),
        ("Connection", "keep-alive")
    )
    start_response(status, headers)
    yield contents

server = netius.servers.WSGIServer(app = app)
server.serve(port = 8080)
In my next post, I'll show the code for the client, and the output.

You may also like to see my earlier posts on similar lines, about generating and serving PDF content using other Python web frameworks:

PDF in a Bottle , PDF in a Flask and PDF in a CherryPy.

The image at the top of this post is of Chinese fishing nets, a tourist attraction found in Kochi (formerly called Cochin), Kerala.

- Enjoy.

- Vasudev Ram - Dancing Bison Enterprises

Signup for email about new products from me.

Contact Page

Wednesday, October 29, 2014

LowFundWala, startup to create videos for startups

By Vasudev Ram

LowFundWala is a recent Indian startup that creates videos for other startups (and also for established companies). They call themselves a startup for startups.

I read about it a newspaper. The concept seems interesting and useful. They also claim to have got some early traction, a.k.a. customers and revenue (unlike the wannabe Zuck type of startup :)

I read in the newspaper that they do videos ranging from low priced to high priced ones.

Promotion of startups' products is one of the applications of the videos they make. They are also supposed to me somewhat of an end-to-end outfit, in that they do everything related to the video themselves, instead of getting parts of the work done by others.

LowFundWala may be worth checking out for Indian or other startups that want to get some videos made for such purposes.

Here is an article about LowFundWala on indianweb2.com.

- Vasudev Ram - Dancing Bison Enterprises

Click here for email about new products from Vasudev Ram.

Contact Page

Monday, October 27, 2014

Pulley, site to sell your downloadable stuff

By Vasudev Ram


Pulley (pulleyapp.com) is a site that allows you to sell your downloadable stuff of any kind. It seems to be something like Gumroad.

I got to know about Pulley via an email newsletter that I get.

From the home page of the Pulley site:

[ Pulley is a simple way to sell your digital art, music, videos, photography, fonts, eBooks, software, and other downloadable products. ]

They have a 14-day free trial (*). Their plans start from $6 per month and can be seen here.

(*) To use the free trial, you have to sign up for one of the paid trials, and then cancel withing 14 days if you don't want to pay and continue using the service. The site says that no credit card is required to sign up.

- Vasudev Ram - Dancing Bison Enterprises

Click here to signup for email about new products from Vasudev Ram.

Contact Page


Friday, October 24, 2014

Print selected text pages to PDF with Python, selpg and xtopdf on Linux

By Vasudev Ram



In a recent blog post, titled My IBM developerWorks article, I talked about a tutorial that I had written for IBM developerWorks a while ago. The tutorial showed some of the recommended techniques and practices to follow when writing a Linux command-line utility that is intended for production use, and how to write it in such a way that it can easily cooperate with existing UNIX command-line tools, when used in a UNIX command pipeline.

This ability of properly written command-line tools to cooperate with each other when used in a pipeline, is, as I said in that IBM article, one of the keys to the power of Linux (and UNIX) as a development environment. (See the classic book The UNIX Programming Environment, for much more on this topic.)

The utility I wrote and discussed (in that IBM article), called selpg (for SELect PaGes), allows the user to select a specified range of pages from a text file. At the end of the aforementioned blog post, I had said that I would show some practical uses of the selpg utility later. I describe one such use case below, involving a combination of selpg and my xtopdf toolkit), which is a Python library for PDF creation.

(The xtopdf toolkit contains a PDF creation library, and also includes some sample applications that show how to use the library to create PDF output in various ways, and from various input sources, which is why I tend to call xtopdf a toolkit instead of just a library.

I had written one such application of xtopdf a while ago, called StdinToPDF(.py) (for standard input to PDF). I blogged about it at the time, here:

[xtopdf] PDFWriter can create PDF from standard input. (PDFWriter is a module of xtopdf, which provides the core PDF creation functionality.)

The selpg utility can be used with StdinToPDF, in a pipeline, to select a range of pages (by starting and ending page numbers) from a (possibly large) text file, and write only those selected pages to a PDF file. Here is an example of how to do that:

First, build the selpg utility from source, for your Linux OS. selpg is only meant to work on Linux, since it uses some Linux C standard library functions, such as from stdio.h, and popen(); but you can try to run it on Windows (at your own risk), since Windows does have (had?) a POSIX subsystem, from Windows NT onward. I have used it in the past. (Update: I checked - according to this section of the Wikipedia article about POSIX, Windows may have had POSIX support only from Windows NT up to Windows 2000.) Anyway, to build selpg on Linux, follow the steps below (the $ sign is the shell prompt and not to be typed):

1. Download the source code from the sources section of the selpg project repository on Bitbucket.

Download all of these files: makefile, mk, selpg.c and showsyserr.c .

2. Make the (shell script) file mk executable, with the command:
$ chmod u+x mk
3. Then run the file mk, with:
$ ./mk
That will run the makefile that builds the selpg executable using the C compiler on your Linux box. The C compiler (invoked as cc or gcc) is installed on most mainstream Linux distributions. If it is not, you will need to install it from the repository for your Linux distribution. Sometimes only a minimal version of a C compiler is installed, which is only enough to (re)compile the kernel after making kernel parameter changes, such as for performance tuning. Consult your local Linux expert for help if such is the case.

3. Now make the file selpg executable, with the command:
$ chmod u+x selpg
4. (Optional) You can check the usage of selpg by reading the IBM tutorial article and/or running selpg without any command-line arguments:
$ ./selpg
which will show a usage message.

6. (Optional) You can run selpg a few times with some text file(s) as input, and different values for the -s and -e command-line options, to get a feel for how it works.

Now download xtopdf (which includes StdinToPDF) from here:

xtopdf on Bitbucket.

To install it, follow the steps given in this post:

Guide to installing and using xtopdf, including creating simple PDF e-books

That post was written a while ago, when xtopdf was hosted on SourceForge. So you need to make one change to the instructions given in that guide: instead of downloading xtopdf from SourceForge, as stated in Step 5 of the guide, get it from the xtopdf Bitbucket link I gave above.

(To make xtopdf work, you also have to install ReportLab, which xtopdf depends uses internally; the steps for that are given in my xtopdf installation guide linked above, or you can also look at the instructions in the ReportLab distribution. It is easy, just a couple of steps - download, unzip, configure a setting or two.)

Once you have both selpg and xtopdf installed, you can use selpg and StdinToPDF together. Here is an example run, to select only pages 2 through 4 from an input text file:

I wrote a simple Python program, gen_selpg_test_file,py, to create a text file that can be used to test the selpg and StdinToPDf programs together.

Here is an excerpt of the core logic of gen_selpg_test_file.py, omitting argument and error handling for brevity (I have those in the actual code):

# Generate the test file with the given filename and number of lines of text.
    try:
        out_fil = open(out_filename, "w")
    except IOError as ioe:
        sys.stderr.write("Error: Could not open output file {}.\n".format(out_filename))
        sys.exit(1)
    for line_num in range(1, num_lines + 1):
        line = "Line #" + str(line_num).zfill(10) + "\n"
        out_fil.write(line)
    out_fil.close()
I ran it like this:
$ python gen_selpg_test_file.py selpg_test_file_1000.txt 1000
to generate a text file with 1000 lines, in the file selpg_test_file_1000.txt .

Then I could run the pipeline using selpg and StdinToPDF, as described above:
$ ./selpg -s2 -e4 selpg_test_file_1000.txt | python StdinToPDF.py p2-p4.pdf
This command extracts only the specifed pages (2 to 4) from the input file, and pipes them to StdinToPDF, which converts those pages only, to PDF, in the filename specified at the end of the command.

After doing the above, you can open the file p2_p4.pdf in your favorite PDF reader (Evince is one PDF reader for Linux), to confirm that it contains all (and only) the lines from page 2 to 4 of the input file selpg_test_file_1000.txt (considering 72 lines per page, which is the default that selpg uses).

Read the IBM article to see how that default can be changed - to either another number of lines per page, e.g. 66 or 80 or whatever, or to specify form feeds (ASCII code 12) as the page delimiter. Form feeds are often used as a page delimiter in text file reports generated by programs, when the reports are destined for a printer, since the form feed character causes the printer to advance the print head to the top of the next page/form (that's how the character got its name).

Though this post seemed long, note that a lot it was either background information or instructions on how to build selpg and install xtopdf. Those are both one time jobs. Once those are done, you can select the needed pages from any text file and print them to PDF with a single command-line, as shown in the last command above.

This is useful when you printed the entire file earlier, and some pages didn't print properly because the printer jammed. Just use selpg with xtopdf to print only the needed pages again.



The image above is from the Wikipedia article on Printing, and titled:

Jikji, "Selected Teachings of Buddhist Sages and Son Masters" from Korea, the earliest known book printed with movable metal type, 1377. Bibliothèque Nationale de France, Paris

- Enjoy.

- Vasudev Ram - Dancing Bison Enterprises

Click here to get email about new products from Vasudev Ram.

Contact Page

Thursday, October 23, 2014

Google Inbox launched, successor to Gmail

By Vasudev Ram


Google has launched a new email product called Google Inbox.

Saw this via Hacker News:

A post about Google Inbox on the official Google blog:

An inbox that works for you.

Hacker News thread about Google Inbox.

Google is going to roll out Inbox in stages to various sets of people. If you want to get an invitation to it, you can email them at inbox@google.com. I did it. Once I get invited, if I find Google Inbox useful or interesting, I will write a post about it.

Meanwhile, here are a few features of Google Inbox mentioned in the official Google blog post:

Bundles (of emails) - like categories that they had before in Gmail.

Highlights - key information from important messages.

Reminders, Assists, and Snoozes.

Assist - if you send a reminder to the hardware store, Assist will tell you its number and if it's open.

Snooze lets you snooze away emails and reminders, until a later time or until you reach another place, like your office.

Interestingly, Google seems to have made a somewhat poor choice of name for the product, again (after doing it with "Go" for the Go language), since in both cases, the word is very common and generic ("inbox" and "Go"), so it will be difficult to search for (even using Google, ironically).

Of course, there are workarounds, like using "golang" instead of "Go", and I'm guessing "Google Inbox" instead of just "Inbox", but those won't work as well as having a more unique name. I just did a Google search for the word "inbox", though, and www.google.com/inbox/ was the first hit.

- Vasudev Ram - Dancing Bison Enterprises

Click here to signup for email notifications about new products and services from Vasudev Ram.

Contact Page