Thursday, May 19, 2011

Bigger Screen

I now have the app mirroring on an external screen/projector, which will make demoing that much easier. It has been tested on my desktop screen and the projectors at school. Check it out:
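Behind the scenes, hooking the app up to an external display on iOS boils down to listening for a screen-connect notification and attaching a window to the new UIScreen. Here is a rough sketch of that pattern; the `projectorView` and `externalWindow` properties are placeholder names, not my actual code:

```objc
// Sketch: attach a window to an external screen/projector when one connects.
- (void)startWatchingForExternalScreens
{
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(screenDidConnect:)
                                                 name:UIScreenDidConnectNotification
                                               object:nil];
}

- (void)screenDidConnect:(NSNotification *)note
{
    UIScreen *external = (UIScreen *)[note object];

    UIWindow *window = [[[UIWindow alloc] initWithFrame:[external bounds]] autorelease];
    window.screen = external;

    [window addSubview:self.projectorView];   // whatever view should show on the projector
    window.hidden = NO;

    self.externalWindow = window;             // keep a retained reference so the window stays alive
}
```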


Monday, May 16, 2011

The Going is Good

A lot of progress has been made on the project over the past couple of weeks. The app is actually "working," meaning that the user interface can interact with the server/vision algorithms and correctly identify a bird species. A breakdown of what's working:
  • The user can select an image from their photo album or take a new one with the camera (see the sketch after this list)
  • The user can submit the image to the server, which kicks off a new session and starts the question-answer cycle
  • The user can answer a multiple choice question and submit their answer
  • The user can answer a part click question and submit their answer
  • The user can change the certainty of their answer ("Definitely," "Probably," "Not Visible")
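For reference, the photo album/camera selection in the first bullet is handled by UIImagePickerController. A simplified sketch of the pattern (the `chooseButton`, `pickerPopover`, and `submitImageToServer:` names are placeholders for my actual code):

```objc
// Sketch: let the user choose a photo from the library or take a new one.
- (void)pickImageUsingCamera:(BOOL)useCamera
{
    UIImagePickerControllerSourceType source = useCamera
        ? UIImagePickerControllerSourceTypeCamera
        : UIImagePickerControllerSourceTypePhotoLibrary;

    // The camera is not available in the Simulator, so always check first.
    if (![UIImagePickerController isSourceTypeAvailable:source]) return;

    UIImagePickerController *picker = [[[UIImagePickerController alloc] init] autorelease];
    picker.sourceType = source;
    picker.delegate = self;

    if (useCamera) {
        [self presentModalViewController:picker animated:YES];
    } else {
        // On the iPad the photo library picker must be presented from a popover.
        self.pickerPopover = [[[UIPopoverController alloc]
                                  initWithContentViewController:picker] autorelease];
        [self.pickerPopover presentPopoverFromRect:self.chooseButton.frame
                                            inView:self.view
                          permittedArrowDirections:UIPopoverArrowDirectionAny
                                          animated:YES];
    }
}

- (void)imagePickerController:(UIImagePickerController *)picker
        didFinishPickingMediaWithInfo:(NSDictionary *)info
{
    UIImage *image = [info objectForKey:UIImagePickerControllerOriginalImage];
    [self submitImageToServer:image];   // hand the image off to the upload code

    if (self.pickerPopover) {
        [self.pickerPopover dismissPopoverAnimated:YES];
        self.pickerPopover = nil;
    } else {
        [self dismissModalViewControllerAnimated:YES];
    }
}
```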
It's nice to have this stuff working, and fairly well tested, but there is still a lot of work to be done. A list of what still needs to be finished:
  • The user should be able to view more information about a particular answer
  • The user should be able to view more information about a particular species solution
  • The user should have some way of "finishing" their session (either the problem was solved or not)
  • Several "specialty" features need to be implemented, including letting a user remove a bird species from the list of results, and to be able to look "under the hood" and check out "part heat readings"
Here are some screen shots showing the current progress:

Home Screen (not too special yet...)

Choosing an image from the library:

Image to be submitted to the server:

Multiple Choice Question View:

Certainty Before Submitting:

Part Click Question View:

Wednesday, May 4, 2011

Visipedia App, Take 2

So over the past couple of weeks I have learned quite a bit about programming for iOS devices. In fact, I have learned enough to realize that my first crack at the Visipedia app will not suffice. So, starting last week, I scrapped everything and restarted with a clean slate.
The biggest change on this second go-around has been the way I use the profiling tools during development. To prevent a massive buildup of allocation and memory issues (most of them subtle), I profile often and repeatedly. Redesigning the app from the ground up has also let me rework some design choices, especially on the networking side, that have really paid off. I have also been able to employ some better design patterns now that I have a better idea of the requirements I have to meet. Basically, Visipedia App 2.0 is much better than Visipedia App 1.0.

Images of the updated interface:


Wednesday, April 27, 2011

Always Clean Up After Yourself....


So the networking side of the application is just about finished. I can now cycle through all of the multiple choice questions and display the question contents on the device. I can also view a list of random solutions. Over the past couple of days I have become very familiar with the profiling and analysis tools offered in Xcode. I have been using these tools to find and fix leaks and allocation problems. I have just about everything cleaned up except for one last allocation problem. Somewhere I am not releasing some of the data that I pull off of the server, so as more and more requests are sent to the server there is a slow increase in memory usage. After approximately 75-100 requests this problem becomes very noticeable. Once I find this bug I will have a pretty stable foundation to build the rest of the UI on.
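For the record, this kind of slow growth under manual reference counting usually comes down to an unbalanced retain. A contrived example of the pattern I'm hunting (not the actual offending code, which I haven't found yet):

```objc
// Leaky version: the explicit retain plus the (retain) property setter
// gives two retains, but only one release ever happens.
- (void)handleResponseData:(NSData *)data
{
    self.latestResponse = [data retain];   // over-retained: the setter already retains
}

// Fixed version: let the retain property do its job and balance it in dealloc.
- (void)handleResponseData:(NSData *)data
{
    self.latestResponse = data;            // property declared as (nonatomic, retain)
}

- (void)dealloc
{
    [latestResponse release];
    [super dealloc];
}
```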

This is about the halfway point in the quarter, and I think I am right on schedule. Getting the networking code finished is a pretty big milestone, and with it complete I can focus more on the UI design. I am hoping to have a "testable" user interface created in the next two weeks. I can then take my iPad to a coffee shop and ask people to give me feedback on the design.

Here is a screen shot of what I have been working with recently:

Friday, April 15, 2011

The Goods Have Arrived


The project became much more "realistic" this week with the addition of an iPad 2 to do testing on. In reality I will still be using the Xcode Simulator quite frequently because a lot of the tasks don't require me to actually interact with the device. However, it will be super sweet to do UI design tests with a real iPad. Scrolling the iPad Simulator with a mouse is just not nearly as fun as doing it with your finger.

Up to now I have just been looking at videos or blogs of iPad apps in order to get a feel for which user interfaces work and which ones don't. I have also been to the computer store on campus a couple times to try out the iPads that they have on display. Having an iPad to work with will make the process of UI testing much easier, and will hopefully enable me to figure out the type of design that I want in a much shorter time. I will also be able to let other people test drive the application, a task that could have been done with the Simulator but would have been really unintuitive for a lot of people. Collaboration on the UI design will help me examine the way people interact with the question-response interface, which could lead to very interesting results. I am predicting that it will be difficult to get people to stop answering questions and look at the solutions, or on the flip side, it will be difficult to get people to stop "flicking" through the solutions and answer a question. Either way, it will be interesting to see what happens.

My first impressions of the device are all positive. This evening I coded up some abstract image classes that I will use for all my image objects (images that go along with questions, images for the solutions, etc.). The images are all responsible for fetching themselves from the Visipedia server, so I am really starting to exercise the networking code that I have written over the past couple weeks. I wrote up a quick program that filled a UITableView with a bunch of images, and I was happy to see that the device downloaded the content quickly and appropriately, and the UI stayed responsive. An example of this table can be seen below.
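To give a flavor of those self-fetching image objects, here is a simplified sketch. My real classes sit on top of the MVCNetworking-derived code described in the previous post, so treat this plain GCD version, with its made-up `VPImage` name, as an illustration of the idea rather than the actual implementation:

```objc
// VPImage (sketch): an image object that knows how to fetch itself
// from the server without blocking the main thread.
@interface VPImage : NSObject {
    NSURL   *imageURL;
    UIImage *image;
}
@property (nonatomic, retain) NSURL   *imageURL;
@property (nonatomic, retain) UIImage *image;

- (void)fetchWithCompletion:(void (^)(UIImage *fetched))completion;
@end

@implementation VPImage
@synthesize imageURL, image;

- (void)fetchWithCompletion:(void (^)(UIImage *fetched))completion
{
    // Download on a background queue so the table view stays responsive...
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSData *data = [NSData dataWithContentsOfURL:self.imageURL];
        UIImage *fetched = data ? [UIImage imageWithData:data] : nil;

        // ...and hop back to the main queue before touching model/UI state.
        dispatch_async(dispatch_get_main_queue(), ^{
            self.image = fetched;
            if (completion) completion(fetched);
        });
    });
}

- (void)dealloc
{
    [imageURL release];
    [image release];
    [super dealloc];
}
@end
```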

Monday, April 11, 2011

More Networking

The past week has been dedicated to building the networking infrastructure on both the client and the server. For the client, I used Apple's MVCNetworking sample code as my foundation. This code provided some pretty sweet classes for handling the numerous networking tasks that must be executed in order to retrieve data from the server. I dove into the code and modified it to suit my needs.
I will have a "Network Manager" that will be used to handle all networking operations and all non-trivial cpu operations. The "Network Manager" will have several operation queues that will be used for executing the different tasks. A network queue will fire off the http requests and listen for responses, and then direct the data to where it needs to go. A cpu queue will fire off image processing tasks and any other intensive task that requires execution.

The classes that handle the actual HTTP requests also have the ability to handle errors that may arise. I don't do anything too special other than wait a specified amount of time before sending the request again, unless the error was fatal. Since this application is really geared toward a demo rather than customers, I am not too worried about handling every corner case of networking problems.
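As a sketch, that retry behavior amounts to something like the following inside a request class; the `retryCount`, `isFatalError:`, `finishWithError:`, and `start` names, and the constants, are placeholders for whatever the real code ends up using:

```objc
static const NSInteger      kMaxRetries        = 3;
static const NSTimeInterval kRetryDelaySeconds = 2.0;

// Sketch: on failure, wait a bit and try again unless the error is fatal
// or we have already retried too many times.
- (void)requestFailedWithError:(NSError *)error
{
    if ([self isFatalError:error] || self.retryCount >= kMaxRetries) {
        [self finishWithError:error];           // give up and report the failure
        return;
    }
    self.retryCount = self.retryCount + 1;
    [self performSelector:@selector(start)      // re-send after a short wait
               withObject:nil
               afterDelay:kRetryDelaySeconds];
}
```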

I have also created some classes to drive the "Network Manager." These classes will put operations on the "Network Manager's" operation queues, and will wait for these operations to finish. For example, when a user answers a question, a request will be sent to the Visipedia server, and a response will be sent back that dictates the next question to ask, along with any other information associated with that question (like images, text, ...). More requests will be sent out to obtain the associated information. By this time, the view on the device will have changed, and hopefully all the requests will have completed so that the images and text for the new question can be displayed. Default placeholders will be used for images that are still downloading when the view changes, or for images whose download failed.

I am actually pretty happy with my current client networking classes. The server will just use some hacked-together PHP scripts until I have finalized the client-server protocols. This week, I will figure out what to do with all the information that I am requesting from the server. My plan is to use Core Data to store image information (such as the relevant question, image URL, image thumbnail, etc.) and to use the file system to store the actual image.
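As a sketch of that storage split, something like the following; the "ImageRecord" entity, its attribute names, and the `questionID`, `imageURL`, `thumbnail`, `fullImage`, `fileName`, and `context` variables are all hypothetical, since none of this is implemented yet:

```objc
// Store the image metadata in Core Data and the full-size image on disk.
NSManagedObject *record =
    [NSEntityDescription insertNewObjectForEntityForName:@"ImageRecord"
                                   inManagedObjectContext:context];
[record setValue:questionID forKey:@"questionID"];                    // relevant question
[record setValue:[imageURL absoluteString] forKey:@"imageURL"];
[record setValue:UIImagePNGRepresentation(thumbnail) forKey:@"thumbnailData"];

// The full image goes in the Caches directory, keyed by a filename kept in the record.
NSString *cachesDir = [NSSearchPathForDirectoriesInDomains(NSCachesDirectory,
                                                           NSUserDomainMask, YES) lastObject];
NSString *path = [cachesDir stringByAppendingPathComponent:fileName];
[UIImageJPEGRepresentation(fullImage, 0.9) writeToFile:path atomically:YES];
[record setValue:fileName forKey:@"fileName"];

NSError *error = nil;
[context save:&error];
```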

Sunday, April 3, 2011

Week 1 Updates

This past week I continued the process of learning Objective-C and experimenting with iOS development. Specifically, I looked into networking solutions that would enable me to easily set up code to communicate with the Visipedia server, and I looked into implementing different kinds of view controllers. I found ASIHTTPRequest to be super useful for the networking problem. After playing around with the code for a bit I was able to post requests to the server. My "QueryServer" button is the culmination of this effort:


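The code behind that button looks roughly like this; the URL is a placeholder and the delegate bodies are simplified:

```objc
#import "ASIHTTPRequest.h"

- (IBAction)queryServerPressed:(id)sender
{
    // Placeholder URL; the real endpoint lives on the Visipedia server.
    NSURL *url = [NSURL URLWithString:@"http://example.com/visipedia/query"];

    ASIHTTPRequest *request = [ASIHTTPRequest requestWithURL:url];
    [request setDelegate:self];
    [request startAsynchronous];   // asynchronous so the UI stays responsive
}

// ASIHTTPRequest delegate callbacks
- (void)requestFinished:(ASIHTTPRequest *)request
{
    NSLog(@"Server responded: %@", [request responseString]);
}

- (void)requestFailed:(ASIHTTPRequest *)request
{
    NSLog(@"Request failed: %@", [request error]);
}
```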
I also dove into the Apple documentation regarding UITableView and UIPopoverController. The UITableView will enable me to present the user with a scrollable list of images that they can swipe through. These images could be answers to questions posed to the user, or they could be the images of the most probable solution to the recognition question. The UIPopoverController is a useful tool that will enable me to provide the user with more information regarding the image that they tapped on in the list of images. The image below displays an example of these uses:
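In code, that tap-to-popover interaction boils down to something like this; `ImageDetailViewController` and the `detailPopover` retain property are placeholder names:

```objc
// Sketch: show more detail about a tapped image in a popover anchored on its row.
- (void)tableView:(UITableView *)tableView didSelectRowAtIndexPath:(NSIndexPath *)indexPath
{
    UIViewController *detail = [[[ImageDetailViewController alloc] init] autorelease];

    self.detailPopover =
        [[[UIPopoverController alloc] initWithContentViewController:detail] autorelease];

    CGRect cellRect = [tableView rectForRowAtIndexPath:indexPath];
    [self.detailPopover presentPopoverFromRect:cellRect
                                        inView:tableView
                      permittedArrowDirections:UIPopoverArrowDirectionAny
                                      animated:YES];
}
```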
During the next week I will focus mainly on server side code development. Hopefully by the end of the week I will be able to post user images and question responses to the server and retrieve questions, images, and updated probability solutions.

Tuesday, March 29, 2011

Resources

Over the past two weeks I have been studying Objective-C and the iOS frameworks. Stanford University offers some great classes that teach iOS development, and luckily they post video podcasts of the class lectures on iTunes. I used the resources from the Fall 2010 CS 193P class taught by Professor Hegarty. These resources can be found here: http://www.stanford.edu/class/cs193p/cgi-bin/drupal/ as well as on iTunes.

Even with the inconvenient budget cuts, our library has some awesome electronic resources for learning Objective-C and iOS development. If you are a student at UCSD, or have access to the electronic resources of the Geisel Library, then these resources can be found here: http://libraries.ucsd.edu/. By searching for Objective-C or iOS, many e-resources will be shown. I found the following pretty useful:
  • Objective-C by Jiva DeVoe
  • Cocoa and Objective-C: up and running by Scott Stevenson
  • Advanced iOS 4 Programming: Developing Mobile Applications for Apple iPhone, iPad, and iPod touch by Maher Ali
Most recently I have been watching and looking at the slides of the 2010 WWDC, which can be found here: http://developer.apple.com/videos/wwdc/2010/

Monday, March 28, 2011

Getting Up To Speed

To get people up to speed on Visipedia, I have collected a couple of links that they can check out:


Basically you can imagine Visipedia to be an augmented version of Wikipedia, where you can perform searches with images instead of text. The "poster child" example of Visipedia would be the following:

Imagine you just took a picture of a bird in your backyard or while you were hiking. You would now like to know more about the bird in your image. With "current technology" this leaves you with several options.
  • Option 1: Type some stuff into the Google search bar and hope for the best...
  • Option 2: Go to a website like whatbird.com
  • Option 3: Crack open that 10-year-old field guide and hope you don't make a mistake when following the instructions, otherwise you will be leafing back and forth until you convince yourself that you took a picture of a penguin in SoCal...
To be fair, "current technology" has gotten us by, but that does not mean we should be happy with the status quo. Enter Visipedia. Instead of typing a text description of the bird into a Google search bar, a user of Visipedia would be able to drag their picture of the bird from their desk top into the search bar, submit their query, and like magic information regarding the bird in their image would come up. No text descriptions, no field guides, nothing but the image.

A lot of work still needs to be done before the above example can become a reality. My project for CSE 190 will be to construct an interface for Visipedia on iOS devices. The main focus will be on the development of a clean, flexible code base that will lay the foundation for a full version of Visipedia. For now, the application will be geared towards the category of birds. An emphasis will be placed on testing different UI designs in order to understand how users will best interact with Visipedia. The Apple mobile devices offer a unique opportunity to allow the user to interact with the recognition algorithms in a very positive and natural manner. This human interaction will greatly assist the recognition algorithms, making them useful well before they are capable of operating without human assistance. Leveraging the touch interface of the devices to make the Visipedia experience fun and intuitive will be a key aspect of this project, one that will hopefully shed light on the future full implementations of Visipedia.

Sunday, March 27, 2011

Hello

This blog will be where I maintain updates during the construction of a Visipedia application on iOS devices.