Categories
English

The Box

Recently I read The Box, a book about the history of container shipping. The book was recommended by Bill Gates: “you won’t look at a cargo ship in quite the same way again after reading this book.” It indeed changed my view of the shipping industry, and here is a summary of my thoughts.

Influence of Containerization

An immediate result of containerization was a sharp decline in international transportation costs, which led to unprecedented globalization and a shift in business paradigms.

Globalization is not a new phenomenon: the world economy was already highly integrated in the nineteenth century. However, the globalization driven by containerization is quite different because it fundamentally changed the production process itself.

Containerization significantly reduced shipping costs between coastal cities in America and those in East Asia, which has an abundant supply of cheap, skilled labor. As transportation costs declined, manufacturers could move production overseas. Many American businesses now do only research and design in the US and delegate manufacturing to Original Equipment Manufacturers (OEMs) in East Asia. This industrial paradigm would not be possible without container ships.

As a consequence, geographical disadvantage has become a more serious problem. Because many American consumers live in coastal cities and sea freight is so cheap, it often no longer makes sense to manufacture in inland cities. Doing business in those inland cities has become much harder because of overseas competition.

In East Asia, coastal cities likewise absorb most of the foreign investment and market access. Guangdong and Jiangxi are two adjacent Chinese provinces, yet the GDP per capita of Guangdong is almost twice that of Jiangxi. A key reason is that Guangdong is coastal while Jiangxi is landlocked.

To narrow the gap, inland cities have to invest heavily in transportation infrastructure to bring down shipping costs, which is very challenging.

Influence of Deregulation

The U.S. government played an interesting role in the history of containerization. Government regulations initially prohibited corporations from being involved in both land-based and sea-based transportation.

The initial goals of these regulations were to prevent monopoly and to ensure a fair price for consumers. However, the good intentions of lawmakers turned out to be a huge obstacle and made cooperation among shipping, railway, and trucking companies very challenging. Railroads and their customers could not negotiate long-term contracts setting rates. Trucks and railcars were often forced to return empty because they were not allowed to pick up freight for the return trip.

Deregulation changed everything. In 1980, Congress freed interstate truckers to carry almost anything almost anywhere at whatever rates they could negotiate. Within five years, 41,021 contracts were signed, and by 1988 U.S. shippers had saved nearly one-sixth of their total land freight bill.

The ability to sign long-term contracts gave railroads an incentive to adopt containers. On average, it cost four cents to ship one ton of containerized freight one mile by rail in 1982, and that cost dropped 40 percent over the next six years, adjusted for inflation.

Although containers were supposed to help cargo move seamlessly among trains, trucks, and ships, it took about 20 years after Malcolm McLean invented the first container for the industry to achieve that goal. The process could have been much faster without government regulation. This case is another example suggesting that the government should stay out of the market most of the time. Governments are too slow to adjust to the market because of bureaucracy, so the best approach is to let the market speak for itself.

 


Tutorial to migrate from Bitbucket to Github

Install mercurial and hg-git

sudo apt-get install mercurial

sudo apt-get install mercurial-git

Note: The version of mercurial should be >= 2.8.

If the default version of mercurial in apt-get is < 2.8, you can install a newer one using pip:

sudo pip install mercurial --upgrade

You need to create a repo on Github.com

Clone your bitbucket repo

hg clone https://hbhzwj@bitbucket.org/hbhzwj/sadit hg-repo

Convert hg repo to git repo

Hg-Git can also be used to convert a Mercurial repository to Git. You can use a local repository or a remote repository accessed via SSH, HTTP or HTTPS. Use the following commands to convert the repository:

$ mkdir git-repo; cd git-repo; git init; cd ..
$ cd hg-repo
$ hg bookmarks hg
$ hg push ../git-repo

The hg bookmark is necessary to prevent problems, as otherwise hg-git pushes to the currently checked out branch, which confuses Git. This will create a branch named hg in the Git repository. To get the changes in master, use the following commands (only necessary on the first run; later just use git merge or rebase).

$ cd git-repo
$ git checkout -b master hg

Push the Git repo to Github Server

cd git-repo
git remote add origin <github-url>
git push -u origin master
cd ..

I also wrote a script to do this automatically.
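
The manual steps above can also be chained together in a few lines of Python. This is only a minimal sketch and not the script mentioned above: the function names migration_commands and run_migration and the default directory names are assumptions made for illustration, and it assumes that hg, git and the hg-git extension are already installed.

```python
import subprocess


def migration_commands(bitbucket_url, github_url,
                       hg_dir="hg-repo", git_dir="git-repo"):
    """Return (working_dir, command) pairs mirroring the manual steps.

    Assumes hg_dir and git_dir are sibling directories under the
    current directory, as in the tutorial.
    """
    return [
        (".", ["hg", "clone", bitbucket_url, hg_dir]),
        (".", ["git", "init", git_dir]),            # same effect as mkdir + git init
        (hg_dir, ["hg", "bookmarks", "hg"]),
        (hg_dir, ["hg", "push", "../" + git_dir]),
        (git_dir, ["git", "checkout", "-b", "master", "hg"]),
        (git_dir, ["git", "remote", "add", "origin", github_url]),
        (git_dir, ["git", "push", "-u", "origin", "master"]),
    ]


def run_migration(bitbucket_url, github_url):
    """Run every step in order and stop at the first failing command."""
    for cwd, cmd in migration_commands(bitbucket_url, github_url):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling run_migration with the Bitbucket clone URL and the GitHub remote URL would run the whole sequence; keeping the command list in a separate function makes it easy to print and review the commands before executing anything.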

 


The Personal Analytics of My Evernotes

Jing Conan Wang

Aug. 09, 2013

I started using Evernote around 2011. Recently the number of notes in my Evernote account surpassed 5000. To celebrate this milestone, I wrote some Python scripts to visualize my notes.

The easiest way to get data out of Evernote is to use the official clients. Both the Windows and the Mac clients can export data in ENEX format. Unfortunately, the Evernote development team has decided not to develop a Linux client in the near future, which makes exporting data on Linux very hard. It may be possible to get the data out through the cloud APIs, but requesting an API key is too cumbersome for this small project.

Although it is the most convenient way, exporting data in ENEX format still has two pitfalls:

First, ENEX is a customized XML format, and it contains some entities, particularly ‘&nbsp;’, that are not correctly recognized by the lxml module in Python. To address this, I wrote a script (XMLTOJson.py) to convert ENEX files to regular JSON files.

Second, an ENEX file doesn’t record which notebook a note belongs to. Fortunately, the Windows client provides a command-line program, ENScript.exe, that can export notebooks separately. I wrote a PowerShell script (export_evernote.ps1) to export the notebooks into a folder in which each ENEX file corresponds to one notebook. Again I used XMLTOJson.py to convert the ENEX files in the folder into JSON files.
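
The conversion step can be sketched roughly as follows. This is not the actual XMLTOJson.py; it is a minimal illustration that uses the standard library parser instead of lxml, and it assumes that ‘&nbsp;’ is the only problematic entity and that each note carries title, created, updated and tag elements. The function name enex_to_records and the sample data are made up.

```python
import json
import xml.etree.ElementTree as ET


def enex_to_records(enex_text):
    """Parse ENEX text into a list of plain dicts, one per note."""
    # Assumption: '&nbsp;' is the only undeclared entity. Replace it with the
    # equivalent numeric character reference so a plain XML parser accepts it.
    cleaned = enex_text.replace("&nbsp;", "&#160;")
    root = ET.fromstring(cleaned)
    records = []
    for note in root.iter("note"):
        records.append({
            "title": note.findtext("title", default=""),
            "created": note.findtext("created", default=""),
            "updated": note.findtext("updated", default=""),
            "tags": [tag.text for tag in note.findall("tag")],
        })
    return records


# Made-up sample with the same overall shape as an exported note.
sample = """<en-export>
  <note>
    <title>Hello&nbsp;world</title>
    <created>20130809T120000Z</created>
    <updated>20130810T083000Z</updated>
    <tag>demo</tag>
  </note>
</en-export>"""

records = enex_to_records(sample)
print(json.dumps(records, indent=2))
```

With one ENEX file per notebook, the notebook name could be attached to each record from the file name before dumping the JSON.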

Each note is associated with two timestamps: the ‘Created’ time and the last ‘Updated’ time.

Here is a plot of the number of notes I created in each year. Considering that only 7 months of 2013 are covered, the total for 2013 should be around 2000. The plot shows that I was most addicted to Evernote in 2012, during which I created over 2500 notes.

The following plot shows the number of notes I updated in each year. The number of ‘updated’ notes was high in 2011 and has been decreasing over the past two years. In 2011 and the first half of 2012, I used the Mac Evernote client on my MacBook Pro. After that, I switched to a ThinkPad X230 with Ubuntu 12.04, on which the only usable option is the web application (www.evernote.com). Updating notes with the Mac client is much easier than with the web application, which may explain the decrease in my note updates.

 

The following plot visualizes the number of notes I created in every month. There is a welcome note whose ‘created’ timestamp is Oct. 19, 2009. However, I signed up for Evernote on Jan. 7, 2011. Why is the ‘created’ time of this welcome note Oct. 19, 2009? I guess this date is the birthdate of Evernote, and the ‘created’ timestamp was deliberately set to it.

The following figure shows the number of ‘updated’ notes in every month. Clearly, I suddenly stopped updating notes in May 2012, which matches the time I switched from Mac to Ubuntu. The updates have recovered somewhat since Mar. 2013, because I began to use Evernote as my GTD engine and need to update my task lists and checklists.

The following two plots visualize the ‘created’ and ‘updated’ notes at the week level. The numbers of ‘created’ and ‘updated’ notes soared during the 12th-14th weeks of 2012. During this time, I was busy preparing my application for Google Summer of Code 2012. The effort paid off: I was selected for Google Summer of Code 2012, which was a wonderful experience.
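
The per-year, per-month and per-week counts behind these plots can be computed with a few lines of standard-library Python. This is a minimal sketch rather than the code in the repository; the timestamp layout in ENEX_TIME_FORMAT, the helper name count_by and the sample timestamps are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. 20130809T120000Z.
ENEX_TIME_FORMAT = "%Y%m%dT%H%M%SZ"


def count_by(timestamps, key):
    """Count timestamps grouped by key(datetime)."""
    return Counter(
        key(datetime.strptime(ts, ENEX_TIME_FORMAT)) for ts in timestamps
    )


# Made-up 'created' timestamps.
created = [
    "20111105T090000Z",
    "20120320T101500Z",
    "20120322T181000Z",
    "20130701T070000Z",
]

per_year = count_by(created, lambda d: d.year)
per_month = count_by(created, lambda d: (d.year, d.month))
# isocalendar() yields (ISO year, ISO week number, ISO weekday).
per_week = count_by(created, lambda d: tuple(d.isocalendar()[:2]))

print(per_year)
print(per_month)
print(per_week)
```

The same function works for the ‘updated’ timestamps, and the resulting counters can be passed directly to a plotting library.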

Evernote provides a feature to tag notes. For each note, you can add as many tags as you like. I usually don’t use this feature manually, but when I clip notes from my mobile phone and RSS reader, which I often do, tags are added automatically. The following figure shows when tagged notes were created. The x-axis is the ‘created’ time and each y-coordinate corresponds to a tag.

I was heavily addicted to Google Reader (GR) when it was alive. In 2011, I often read GR on my iPhone using MobileRSS, which added ‘MobileRSS’ tags when I clipped articles. Later, I was irritated when the app asked me to pay a second time after I upgraded to a newer iOS, ignoring that I had already purchased the pro version. As a result, I switched to Newsfy, which does not add tags automatically. After GR was shut down in July, I switched to Feedly together with many other disappointed GR users.

From 2012, I started to add tags manually. One feature missing from Evernote is the ability to assign importance to each note, as in Gmail. I emulated this feature by tagging notes with @TOREAD, @✭ and @✭✭.

In Evernote, a notebook is a collection of individual notes. The following figure shows a stacked area graph of the number of notes I created in each month. I started to use the notebook feature systematically after April 2012. Before April 2012, I mostly used Evernote as an archive of web pages and dumped most notes into one notebook. ‘Programming’ is one of the first notebooks I created; it was later divided into more fine-grained categories.

The ‘cybersecurity’ notebook, which is related to my research, dominated in May and the beginning of June 2012. From May 2012 to Aug. 2012, my focus shifted to ‘GSOC’, ‘Programming’ and ‘Python’, which was due to my participation in Google Summer of Code 2012.

There is a visible gap in Sep. 2012, when I took a vacation to recover from the demanding work of a very busy summer. I am a fan of classical music; even in that busy summer, there was still a considerable number of ‘Music’ notes. The ‘GSOC’ notebook disappeared after Google Summer of Code officially ended in Sep. 2012. In terms of the number of created notes, the four notebooks I used the most are ‘Python’, ‘Linux’, ‘Music’, and ‘others’. The ‘others’ notebook was created in June 2013 to store miscellaneous notes.

The following figure shows the number of ‘updated’ notes in every month for each notebook. An interesting observation is that few notes in the ‘others’ notebook are updated despite its considerable size. This indicates that I rarely review the notes in the ‘others’ notebook, a bad habit I should fix in the future.

The following two figures show the number of ‘created’ and ‘updated’ notes in every week for each notebook. For most notebooks, although the number of ‘created’ notes doesn’t change significantly, the number of ‘updated’ notes increased rapidly in the 12th week of 2013. The reason is that I categorized many notes manually in that week.

This post was inspired by Stephen Wolfram’s blog post The Personal Analytics of My Life.

I fully agree with Mr. Wolfram that personal data is very useful and everyone should log their own life as much as possible. Evernote is a good tool to achieve this goal.

Surely there is more information I can dig out, but this article provides a good starting point. The source code of this project is available at:

https://github.com/hbhzwj/VizEvernote

If you are interested, you can try to analyze your own Evernote data using this code. Any suggestion, bugfix or improvement is welcome.


Work Efficiently In the Information Age


Python Program Configuration


Python Code Optimization

Code optimization is very important, especially for a dynamic language like Python. Recently I finished a Python program for Approximate Dynamic Programming. One part of the program used Dijkstra’s algorithm to calculate shortest-path distances in a network. Since the network is really large (about 1 million nodes), the first version of the code was really slow and took about 2000 seconds per run. After optimization (which took me about 6 hours), each run takes just 0.2 seconds. This is an incredible improvement.

I used the following tools during the process, which I highly recommend:

  1. Python cProfile module: a built-in profiling module
  2. runsnakerun: a viewer for the cProfile output
  3. line_profiler: a line-by-line profiler


Using runsnakerun is quite easy; just run the following two commands:
$ python -m cProfile -o <outputfilename> <script-name> <options>
$ runsnake <outputfilename>

line_profiler is a line-by-line Python profiler, and its webpage is:
http://packages.python.org/line_profiler/

Add the @profile decorator to the function you want to optimize, then run:
$ kernprof.py [-l/--line-by-line] script_to_profile.py
$ python -m line_profiler script_to_profile.py.lprof
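
cProfile can also be driven from inside a script, which is handy when only one part of the program is of interest. The following is a minimal sketch using the standard cProfile and pstats modules; the function slow_sum is a made-up stand-in for the code being profiled.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in memory, sorted by cumulative time, top 5 entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The printed table lists call counts and timings per function, which is the same data that runsnakerun visualizes from the output file.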

 

Here are some tips to improve Python performance:

1. The in operator is really slow when the list is large. You may often write

if a in S:
    do something
else:
    do some other thing

When S is very large (thousands of elements, for instance), the in operator becomes a bottleneck.

A good way to optimize this is to keep a list of boolean flags, indexed by element, that indicates whether each element is in the set or not. Then the code becomes:

if I[a]:
    do something
else:
    do some other thing

Of course, it will use more memory.
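
Here is a minimal runnable sketch of the flag-list idea. It assumes the elements are small non-negative integers (node ids, for example), so they can be used directly as list indices; the helper name build_flags and the data are made up. A built-in set is shown next to it only for comparison and goes beyond the tip above.

```python
def build_flags(members, universe_size):
    """Return a list where flags[x] is True if and only if x is in members."""
    flags = [False] * universe_size
    for x in members:
        flags[x] = True
    return flags


S = list(range(0, 10_000, 3))             # made-up list of node ids
I = build_flags(S, universe_size=10_000)  # one flag per possible id
S_set = set(S)                            # hash-based alternative

queries = [0, 1, 2, 3, 9_999]
by_list = [q in S for q in queries]       # scans the list on every lookup
by_flags = [I[q] for q in queries]        # a single index operation
by_set = [q in S_set for q in queries]    # a single hash lookup

print(by_list, by_flags, by_set)
```

All three approaches give the same answers; only the cost per lookup differs, and the flag list additionally needs one slot for every possible element.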

2. Call library functions as much as possible.

Most library functions are highly optimized or implemented in C or C++. So if a library function does the job, don’t implement it yourself in Python. The for loop in Python is slow. You can use SciPy vectors to accelerate the program. You should try your best to reduce the number of for loops in Python.
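
As an illustration of leaning on the library, here is a minimal sketch of Dijkstra’s algorithm that delegates the priority queue to the standard heapq module instead of scanning for the closest node by hand. This is not the code from my program; the graph representation (a dict mapping each node to a list of (neighbor, weight) pairs), the function name and the tiny example graph are assumptions, and edge weights are assumed to be non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
distances = dijkstra(graph, "a")
print(distances)
```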

3. Don’t check for a rare case every time.
You may have a list with 100 elements and receive an index that may or may not be within range(0, 100). If it is in the range, return the corresponding value; otherwise return -1. But 99.99% of the time the index is in the range. The bad approach is:
if 0 <= idx < 100:
    B = A[idx]
else:
    B = -1
A better approach is:
try:
    B = A[idx]
except IndexError:
    B = -1
Be careful not to catch the broad Exception; catch the specific exception type. Also note that negative indices wrap around in Python instead of raising IndexError, so the try version assumes that idx is never negative.

 


Digest of Henry Ford’s My Life and Work

This is a reading digest of Henry Ford’s My Life and Work. Henry Ford is among the able people who completely changed human transportation and consequently shaped our society. He was a businessman with a great vision for society. Real business leaders are always visionary and think big; that is why they are charismatic.

The Government is a servant and never should be anything but a servant.

Note: The government does not need to be brilliant or productive by itself. Its duty is to make sure that the most brilliant people in society, whether they are scientists, musicians, entrepreneurs, or others, have a safe and inspiring environment in which to use their brilliance.

Business men believed that you could do anything by “financing” it. If it did not go through on the first financing then the idea was to “refinance.” The process of “refinancing” was simply the game of sending good money after bad. In the majority of cases the need of refinancing arises from bad management, and the effect of refinancing is simply to pay the poor managers to keep up their bad management a little longer. It is merely a postponement of the day of judgment. This makeshift of refinancing is a device of speculative financiers. Their money is no good to them unless they can connect it up with a place where real work is being done, and that they cannot do unless, somehow, that place is poorly managed. Thus, the speculative financiers delude themselves that they are putting their money out to use. They are not; they are putting it out to waste.

Note: It is worse for “good” money to be put into a “bad” business than for “bad” money to be put into a “good” business. By putting your money into a poorly managed company, you are actually endorsing an inefficient production unit, which takes resources that should be available to more efficient businesses. The right approach to a troubled business is to change its management, or simply shut it down and start over if management reform is not possible, rather than extending its life by burning more money.

George B. Selden, a patent attorney, filed an application as far back as 1879 for a patent the object of which was stated to be “The production of a safe, simple, and cheap road locomotive, light in weight, easy to control, possessed of sufficient power to overcome an ordinary inclination.” This application was kept alive in the Patent Office, by methods which are perfectly legal, until 1895, when the patent was granted.

Note: Any law can be misused, including patent law. One thing to remember: a good business needs patents to protect itself, not to attack others.

I believe that there is very little occasion for charity in this world–that is, charity in the sense of making gifts. Most certainly business and charity cannot be combined; the purpose of a factory is to produce, and it ill serves the community in general unless it does produce to the utmost of its capacity.

Note: Charity itself should not be abused. 99% of the charity in the world just satisfies people’s need for superiority, and it encourages nothing but laziness. Real charity provides poor people with the essential opportunity to earn a living by themselves.

The habit of failure is purely mental and is the mother of fear. This habit gets itself fixed on men because they lack vision. They start out to do something that reaches from A to Z. At A they fail, at B they stumble, and at C they meet with what seems to be an insuperable difficulty. They then cry “Beaten” and throw the whole task down. They have not even given themselves a chance really to fail; they have not given their vision a chance to be proved or disproved. They have simply let themselves be beaten by the natural difficulties that attend every kind of effort.

Note: It is just a matter of habit.

A country becomes great when, by the wise development of its resources and the skill of its people, property is widely and fairly distributed.

Note: The problem of China is not ideology. We don’t care whether China is socialist or capitalist. What we care about is whether people have equality of opportunity, in the sense that self-motivated people who want to earn their living can get the resources they should have.

Perhaps no word is more overworked nowadays than the word “democracy,” and those who shout loudest about it, I think, as a rule, want it

Note: Look at Chairman Mao. Simply don’t trust politicians who shout loudest for democracy.

The workingman himself must be on guard against some very dangerous notions–dangerous to himself and to the welfare of the country. It is sometimes said that the less a worker does, the more jobs he creates for other men. This fallacy assumes that idleness is creative. Idleness never created a job. It creates only burdens. The industrious man never runs his fellow worker out of a job; indeed, it is the industrious man who is the partner of the industrious manager–who creates more and more business and therefore more and more

Note: Idleness creates no value.

The public was paying, and business was booming, and management didn’t care a

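Note: Rapid growth in a company’s headcount does not mean rapid growth of the company. The growth of payrolls does not necessarily mean the growth of the business. I do not understand why some business leaders boast about the size of their workforce. In my view, creating the same value while employing more people is nothing to boast about. That is not creating jobs; it is lowering the efficiency of society.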

An employer may be unfit for his job, just as a man at the lathe may be

Note: When a boss is unfit for his position, there are two ways to increase the welfare of society. The first option is to fire the boss and the second is to let the business fail. If the boss is fired, there is some possibility that the business will recover, but that possibility is very small, if it exists at all. A more common case is that the business is driven out and a new company replaces it. The company is dead, long live the market. We tend to feel, sentimentally, that a company should live as long as possible, but a long-lived company is not necessarily good for society. Rapid turnover of companies is a better way to move society forward.

I pity the poor fellow who is so soft and flabby that he must always have “an atmosphere of good feeling” around him before he can do his work. There are such men. And in the end, unless they obtain enough mental and moral hardiness to lift them out of their soft reliance on “feeling,” they are failures. Not only are they business failures; they are character failures also; it is as if their bones never attained a sufficient degree of hardness to enable them to stand on their own feet.

Note: I cannot agree more. If you can work only when you feel good, you are doomed to be a loser.


Google Summer of Code 2012 !

I am thrilled that my project, “network malware simulation”, has been accepted by Google Summer of Code 2012! I will work on this project from May to August, and an interesting new piece of open source software will be born this summer 🙂

Click here for my proposal:

http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/jingconanwang/1


paradigm of problem solving

In the 19th century, engineering was almost a synonym for mechanical engineering. A typical way to solve a problem at that time was to analyze the mechanical structure, build a machine with gears and wrenches, and let a steam engine drive the machine.

The steam engine was a revolutionary innovation because it introduced a “paradigm” for problem solving. If you have a mechanical description of the solution to your problem, the steam engine can take care of the rest. Many derivative innovations followed this “paradigm”, such as the plane, steamship, submarine, and automobile (some of them used the internal combustion engine instead, an improvement on the steam engine). Theoretically, the steam engine and its derivatives can solve any mechanical problem; all people need to do is describe the problem in mechanical language.

Nowadays people are accustomed to turning to computers when they have problems. The computer is another revolutionary innovation because it provides a similar paradigm: formulate an arithmetic model, propose a corresponding algorithm, and run the algorithm on a computer. A computer is basically a calculator that can do addition very fast. Since all arithmetic operations can be reduced to additions, the computer theoretically gives us the ability to solve any arithmetic problem; all people need to do is describe the problem in arithmetic language.
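
As a toy illustration of reducing arithmetic to addition, the sketch below builds multiplication out of repeated addition and exponentiation out of repeated multiplication. It only covers non-negative integers, and the function names are made up for this example.

```python
def multiply(a, b):
    """Multiply two non-negative integers using only addition."""
    total = 0
    for _ in range(b):
        total = total + a
    return total


def power(base, exponent):
    """Raise base to a non-negative integer exponent via repeated multiply()."""
    result = 1
    for _ in range(exponent):
        result = multiply(result, base)
    return result


print(multiply(6, 7))
print(power(2, 10))
```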

Despite their different appearance, the steam engine and the computer have something in common. Each provides a good solution to one fundamental problem: the steam engine deals with the problem of “generating rotary movement with strong force”, and the computer deals with the problem of “doing addition at high speed”.

When a problem can be divided into a long sequence of fundamental problems, we can solve it by solving those fundamental problems. With this in mind, we can easily conclude that there are many other paradigms besides those described above. The figure shows the paradigms for the steam engine and the computer.

Why paradigm?

We need paradigms because they save us time. The real world is too complex and we cannot do everything well. A reasonable approach is to solve a small set of problems perfectly and then transform the problem we want to solve into those problems.

Limitations of Paradigm

Every paradigm has its own limitations, which are determined by two factors. The first is the fundamental problem itself: you cannot use a steam engine to read this blog, since the transmission of bits cannot be modeled as a sequence of physical movements.

The second factor is our ability to solve the fundamental problem. The computer is doubtless the most awesome invention human beings have ever made. However, there are still many things that computers cannot do despite the efforts of the past 50 years. They cannot understand imprecise information, they have very limited communication skills, and so on. All these problems belong to the field of artificial intelligence. The reason is that although an arithmetic model theoretically exists for all these problems, the number of arithmetic operations required is beyond the ability of any existing computer, even for a small problem.

At the dawn of the computer era, scientists estimated that it would take only 10-20 years for machines to catch up with humans in terms of intelligence. However, the expected date was missed again and again. There has been no big improvement in the last 30 years, and it seems that researchers are just waiting for computers to become more and more powerful under Moore’s law.

However, I strongly believe most scientists are on the wrong track. The computer paradigm cannot give us a perfect solution for artificial intelligence, because human intelligence cannot be described by an arithmetic model of reasonable complexity. We need other types of paradigms to get a perfect solution for artificial intelligence.


Jing Conan Wang
wangjing A_T bu.edu

reference:
http://net.educause.edu/ir/library/pdf/ERM0132.pdf
Henry Ford, My Life and Work

image source
http://imspeaking.wordpress.com/category/problem/
http://vector-images.com/clipart/clp13013/

The steam engine made it possible for machines to move by themselves. Later the internal combustion engine took the place of the steam engine, but the steam engine was the beginning of this revolution.


Ubuntu on MacBook Pro

If you want to install Ubuntu on a MacBook Pro, please refer to the following webpage:

https://help.ubuntu.com/community/MacBookPro

Note that if you want the right resolution, you should download the driver from the official NVIDIA site and not use nvidia-current from apt-get, which will crash the GUI.

If you have installed nvidia-current and cannot enter the GUI, just delete /etc/X11/xorg.conf and then type startx in the terminal. You will then enter a low-resolution GUI. After that, download the NVIDIA driver from the official site and install it again.