Distributed tuning of machine learning algorithms using MapReduce clusters
My review of Ganjisaffar et al.’s 2011 LDMTA paper is available at:
http://www.computingreviews.com/browse/browse_reviewers.cfm?reviewer_id=123480
Pico y Placa – Is Washington DC ready to tackle traffic congestion?
Congestion pricing (also known as variable tariff and dynamic pricing) is a mechanism to charge different amounts for traffic at different times. As a simple example, you may have to pay $2 to use a road at peak hours, while it may cost you only $1 (or be free) at non-peak hours. The Washington DC metro uses the same concept, although it has more than two tiers and has the dreaded peak-of-the-peak charge as well. Airlines and hotels use the same concept, with their prices computed similarly through a demand and supply mechanism. Generally speaking, congestion pricing is a good mechanism for passenger traffic, compared to appointment scheduling and network capacity management, which are usually used in freight traffic management systems, for example, NX FTMS.
However, congestion pricing (charging more) does not work if you are not set up to charge anything at all! So a different mechanism is used at times: allowing only some vehicles on the roads. This is typically known as road space rationing. For example, you could choose to allow only blue cars on the road on one day, and only red cars on another day. Oh wait, you can’t actually do that, as it would appear to be politically motivated. So, let us try again. You can allow cars whose license plate number ends in an odd digit on the road on one day, and cars whose last digit is even on another day. Then, you can rotate those digits and days as you wish to make sure that the system is fair and not gameable (and you may need some provision for vanity plates).
The best example of this that I have come across is in Bogotá, where the system is called pico y placa, which combines the peak traffic hour (pico) with the license plate (placa), although since 2009 the peak hours have been extended to almost the entire day (6 AM to 8 PM). Knowing Bogotá’s traffic, I think that is a fair characterization of the peak hour. The way the system works is that cars with number plates ending in 5, 6, 7 or 8 cannot drive between 6 AM and 8 PM on Mondays. The restricted last digits for the other days are:
- 9, 0, 1, 2: Tuesday
- 3, 4, 5, 6: Wednesday
- 7, 8, 9, 0: Thursday
- 1, 2, 3, 4: Friday
There are no restrictions on the weekends.
So, firstly we observe that if you have two cars, one with a plate ending in 5 and the other ending in 0, then you have at least one usable car every day. So, people with means try to have two cars with different number plates. For that reason, the Bogotá pico y placa changes the digits every year, so it is theoretically possible that your good combination may not be that good next year. However, based on my understanding of Bogotá’s pico y placa, the digits are always in contiguous blocks of 4, so if your two license plates end in 0 and 5, then you are perpetually safe. If my understanding is correct, then that is a limitation of the system, and the numbers should be mixed up more thoroughly. To put it in perspective though, that limitation is small, as only a limited number of people buy two cars to circumvent the system.
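As a quick sanity check on the reasoning above, here is a minimal Python sketch. It encodes the weekday table exactly as listed in this post, and it assumes (as I do above) that a "contiguous block" may wrap around from 9 back to 0. The function and variable names are my own, purely for illustration, and are not part of any official system.

```python
# Restricted last digits per weekday, as listed in this post (6 AM to 8 PM).
RESTRICTED = {
    "Monday": {5, 6, 7, 8},
    "Tuesday": {9, 0, 1, 2},
    "Wednesday": {3, 4, 5, 6},
    "Thursday": {7, 8, 9, 0},
    "Friday": {1, 2, 3, 4},
}


def can_drive(last_digit, day):
    """Return True if a plate ending in last_digit may drive on the given day."""
    # Weekends are absent from RESTRICTED, so nothing is blocked then.
    return last_digit not in RESTRICTED.get(day, set())


def household_always_mobile(digits, schedule=RESTRICTED):
    """Return True if at least one of the plates can drive on every restricted day."""
    return all(
        any(d not in blocked for d in digits) for blocked in schedule.values()
    )


def cyclic_blocks(size=4):
    """All runs of `size` consecutive digits, wrapping from 9 back to 0."""
    return [{(start + i) % 10 for i in range(size)} for start in range(10)]


# Assumption: yearly reshuffles always use contiguous (wrapping) blocks of 4.
# Digits 0 and 5 are five apart on the cycle, so no such block holds both.
pair_never_blocked_together = all(
    not {0, 5} <= block for block in cyclic_blocks()
)

print(can_drive(5, "Monday"))            # False
print(household_always_mobile([0, 5]))   # True
print(pair_never_blocked_together)       # True
```

By contrast, `household_always_mobile([5, 6])` returns False, since both plates are restricted on Mondays. If the yearly blocks were not contiguous, the last check would no longer hold in general, which is exactly the "mix the numbers more thoroughly" point.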
Could pico y placa work in DC?
Granted, we would have to call it peak and plate here, but could the concept work? For about 8 months of 2011 I participated in a ride share program in which my ride share partner and I would drive on alternate weeks, to take advantage of the HOV restriction on I-66. The ride share program works as is, and there is even a full slugging system in DC, but if more people participated in ride share programs, it would perhaps be easier to find people leaving from and going to the same places, and ride sharing could work even better. Participation would of course be almost mandated by peak and plate in DC.
Although WMATA stopped trying to be price-competitive a very long time ago, it is possible that, due to increased ridership, even the metro fares may come down a little. (For a visual on how high metro fares suppress ridership, which in turn raises fares, consider this.)
After a lengthy preamble, here is a question for you:
Could a Peak and Plate system work in Washington DC?
- Might work (63%, 290 Votes)
- No chance - people will find ways around (31%, 144 Votes)
- It would work really well! (6%, 26 Votes)
- What's Peak and Plate again? (0%, 3 Votes)
Total Voters: 463
Business Plan – Simple is Sufficient?
Just because a business plan is simple does not guarantee success. Consider this:
Buy Chuck E. Cheese tokens to play their games. Win tickets from the games and buy soft toys using the tickets. Sell the soft toys to retailers. Pocket the difference as your profit.
Is that simple enough for you?
UNIX System Administration – Study Guide (Another Semester Draws to a Close)
So as another semester draws to a close, the questions about a “study guide”, “mock finals” and such arise again. This guide is a brief outline in that regard for CSCI 4418 (UNIX System Administration), the undergraduate + graduate computer class that I taught at GWU during Spring 2012. This is not an exhaustive guide, and there are almost always some questions in the final which are not touched upon here; the lectures and the class discussions are a superset of this document. Students should feel free to discuss these topics and questions among themselves, but I do not provide answers to these. And the answers change every year anyway.
Topics
Scripting and Shell, Access Control, Process management, File system, User management, Software management, Setting up Cron, Backup – Dump and Restore, Drivers, Devices and Kernel, Networking, Routing and DNS, Web Applications, Performance Monitoring
Broad Questions
There is really only one basic question that I want to ask at the end of my class, and that question is the same at the end of every class and at the end of every semester (or a weekend, if I happen to be teaching a weekend seminar class). That question is: How has this class changed your outlook on the subject matter covered, and what are some of the things that you may do differently now, with the benefit of this class?
Since that question has the abstractness of Eliot’s poetry in it, sometimes I use other questions in its stead. These are some examples of questions that are intentionally vague. The focus is on checking how broadly you think. Real life problems are usually this vague. In each of these, the setting is that you are the system administrator for the IT system (the set of servers, etc.) for that organization.
- A user complains that another user has been able to read his private files. What measures would you take to prevent this from happening again?
- Your applications team wants to build and deploy a web application that will allow the employees to see how much vacation time they have left. What are some of the security considerations you will check/ensure/discuss with them?
- Some of the servers seem to slow down suddenly at different times. What debugging would you perform to check the server performance, see who & what is causing the slowness, and then to possibly improve it?
- You have many users who are business analysts, and they have secretly confided in you that their definition of computer usage involves Facebook more than it involves UNIX/Linux. When they log into a UNIX server that you manage, they go through a list of items: set system color settings so they can read the system outputs, connect to an Oracle database, run some reports, download those reports, etc. They would like to not have to do all this repetitive work. The reports that they run can change from day to day, but the rest of it is pretty much the same. How can you help them get through the repetitive work?
- Your users occasionally delete some files and ask you to “restore” them. How can you create a scheme or a framework for preparing for these kinds of requests? You may not be able to prevent data loss completely, but your goal is to create a framework for “reasonable” backups. For example, if the total size for files for all users is 20 TB, approximately how much space are you allocating for backups?
- Your users would like to share some files. What options will you consider?
- Your application team is using a database which has many reads and writes. They have heard that RAIDing is a “good idea”. What are some of the RAID choices that you may consider, and what are their advantages and disadvantages?
- When might you need to write a device driver?
Scripting Questions (1 Example)
Write a script that counts all unique words that appear in any of the *.txt files that exist anywhere inside the folder /home/courses/cs4418. For example, suppose these are the two files:
/home/courses/cs4418/a/b/c/d/e/f/abc.txt: {Justin Bieber is a great singer.}
/home/courses/cs4418/de/f/xyz.txt: {Lady Gaga is a superb singer.}
Then, your script should print the answer that there are a total of 9 unique words in 2 files.
Objective Questions
Objective questions for this class typically test the student’s knowledge of specific commands and specific concepts. For example:
- How many times would a given cron job run between two given dates?
- ______ command does x.
- Command y does ____.
- What is kernel space?
- What are two broad kernel architectures?