Using Pthreads in Parallel Computing
"Pthread Parallel K-means" [pdf]
K-means is a popular non-hierarchical method for clustering large data sets. Its running time increases linearly with the size of the data set, which makes it particularly suited to extremely large data sets such as those found in digital libraries. The method was developed by MacQueen in 1967. In our project we take a uniprocessor k-means algorithm and implement a parallel k-means algorithm using pthreads. The algorithm we use is from the Normalized Cuts (Ncuts) code base of the Vision Group at UC Berkeley. The parallel implementation demonstrated good speedup and improved performance.
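As an illustration only (this is not the project's code), the sketch below shows one common way to parallelize k-means: the assignment step is split into contiguous chunks, one per worker thread, and the centroid update runs serially after all workers have been joined. Python's `threading` module stands in for pthreads here, with `Thread.start()` and `Thread.join()` playing the roles of `pthread_create` and `pthread_join`. The function names, the chunking scheme, and the stopping rule are all assumptions.

```python
import threading


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for j, c in enumerate(centroids):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if d < best_d:
            best, best_d = j, d
    return best


def assign_chunk(points, centroids, labels, start, end):
    """Worker body: label points[start:end]. Each worker writes a disjoint slice of labels."""
    for i in range(start, end):
        labels[i] = nearest(points[i], centroids)


def parallel_kmeans(points, centroids, num_threads=4, max_iters=100):
    """Hypothetical helper: k-means with a threaded assignment step and a serial update step."""
    n, k, dim = len(points), len(centroids), len(points[0])
    centroids = [list(c) for c in centroids]
    labels = [-1] * n
    chunk = (n + num_threads - 1) // num_threads  # ceiling division
    for _ in range(max_iters):
        old = list(labels)
        # Assignment step: one thread per contiguous chunk of points.
        threads = []
        for t in range(num_threads):
            start, end = t * chunk, min((t + 1) * chunk, n)
            if start >= end:
                break
            th = threading.Thread(
                target=assign_chunk, args=(points, centroids, labels, start, end)
            )
            th.start()
            threads.append(th)
        for th in threads:
            th.join()  # barrier: all labels are final before the update step
        if labels == old:
            break  # no point changed cluster, so the centroids are stable
        # Update step (serial): move each centroid to the mean of its points.
        sums = [[0.0] * dim for _ in range(k)]
        counts = [0] * k
        for p, label in zip(points, labels):
            counts[label] += 1
            for d in range(dim):
                sums[label][d] += p[d]
        for j in range(k):
            if counts[j]:
                centroids[j] = [s / counts[j] for s in sums[j]]
    return centroids, labels
```

Because each worker writes a disjoint slice of `labels`, the assignment step needs no lock. In CPython the global interpreter lock prevents real speedup for this pure-Python loop, so the sketch shows the structure of the decomposition rather than its performance.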
"Active Learning for Video
technical report TJ Watson, August
In this paper, we present the design and implementation of the Video Annotation Library (VAL). VAL interfaces with the IBM VideoAnn annotation tool and is designed to aid in the semantic labeling of video and image databases using active learning techniques. A list of confidence values and nearest-neighbor values is maintained for each video segment in the VideoAnn database. A variety of similarity measures can be used interchangeably to create hierarchies (or cluster trees) of the features of each video. The learning approach proposes sample video segments to the user for annotation and updates the database with the new annotations. It then uses its accumulated knowledge to label the rest of the database, after which it proposes new samples for the user to annotate. The proposed samples are selected by their knowledge gain to the active learner.
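The propose, annotate, and relabel loop described above can be sketched as follows. This is a minimal illustration, not VAL's API: every name is hypothetical, labels are propagated from the nearest labeled neighbor, and "knowledge gain" is approximated by picking the segments with the lowest confidence, which is an assumption about how such a score might be computed.

```python
import math


def propagate(features, labeled):
    """Label each segment from its nearest labeled neighbor.

    Confidence is 1.0 for annotated segments and 1 / (1 + distance) otherwise
    (an assumed scoring rule for illustration).
    """
    predictions, confidence = {}, {}
    for seg, vec in features.items():
        if seg in labeled:
            predictions[seg], confidence[seg] = labeled[seg], 1.0
            continue
        near = min(labeled, key=lambda s: math.dist(vec, features[s]))
        predictions[seg] = labeled[near]
        confidence[seg] = 1.0 / (1.0 + math.dist(vec, features[near]))
    return predictions, confidence


def propose_samples(confidence, labeled, k):
    """Pick the k unlabeled segments the learner is least confident about."""
    candidates = [s for s in confidence if s not in labeled]
    return sorted(candidates, key=lambda s: (confidence[s], s))[:k]


def active_learning(features, oracle, seed, rounds, k):
    """Alternate between labeling the database and asking the user (oracle) for annotations."""
    labeled = {s: oracle(s) for s in seed}
    for _ in range(rounds):
        _, confidence = propagate(features, labeled)
        for seg in propose_samples(confidence, labeled, k):
            labeled[seg] = oracle(seg)  # the user annotates the proposed segment
    return propagate(features, labeled)[0]
```

Here `oracle` is any callable that returns the user's annotation for a segment, and `features` maps segment names to numeric feature vectors.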
"A B-tree For Distributed Data Structures" [ps]
This paper presents the design and implementation of a single-node B-tree storage layer for Ninja Distributed Data Structures (Gribble et al.). The implementation follows the finite state machine (FSM) programming model used in DDS and described by the Ninja Project. The key feature of the B-tree design is its lock-free, non-blocking implementation. Control-flow-based concurrency control (as opposed to lock-based concurrency control) manages concurrency using non-blocking FSMs and queues. Performance measurements show that the B-tree scales well and degrades gracefully under heavy load. The in-core and out-of-core measurements on the DDS Buffer Cache showed that the primary contributor to latency was not I/O but computation: the CPU cost of traversing the B-tree index dominates.
"Wide Area Events Using Distributed Data Structures" ( ppt
) ( PSN
The Distributed Data Structure (DDS) API is a powerful abstraction on which to build cluster applications easily from reusable components. We show how the DDS can efficiently support the Publish-Subscribe-Notify class of Internet services. As an example, the "Food in the Woz" application constructs a channel that notifies subscribers when food is available in the Woz lounge. The channel records user preferences (such as the device on which the user wishes to be contacted) and delivers food events to subscribers after a transducer has received events from Ninjamail and a filter has generated events for each user.
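The transducer, filter, and delivery stages of such a channel might look like the sketch below. It is a single-process illustration with hypothetical names: a plain dictionary stands in for the DDS that would hold subscriber preferences, a mail message is modeled as a dictionary with a `subject` field, and delivery is simulated by appending to a list.

```python
class Channel:
    """Toy Publish-Subscribe-Notify channel (illustrative only, not the DDS API)."""

    def __init__(self, name):
        self.name = name
        self.preferences = {}  # user -> device; stands in for a DDS hash table
        self.outbox = []       # (device, user, text) records in place of real delivery

    def subscribe(self, user, device):
        self.preferences[user] = device

    def unsubscribe(self, user):
        self.preferences.pop(user, None)

    def transduce(self, mail):
        """Turn a raw mail message into a channel event, or None if it is not about food."""
        subject = mail.get("subject", "")
        if "food" not in subject.lower():
            return None
        return {"channel": self.name, "text": subject}

    def filter(self, event):
        """Generate one event per subscriber, tagged with that user's preferred device."""
        for user, device in sorted(self.preferences.items()):
            yield {"user": user, "device": device, "text": event["text"]}

    def publish(self, mail):
        """Run the pipeline for one message and return the number of notifications sent."""
        event = self.transduce(mail)
        if event is None:
            return 0
        count = 0
        for e in self.filter(event):
            self.outbox.append((e["device"], e["user"], e["text"]))
            count += 1
        return count
```

Keeping the preferences in a single keyed structure is what lets the filter stage fan one incoming event out into per-user events without any other shared state.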
"Using Programmable Routers to Protect Intranets"(ppt)
We are facing a trend towards ubiquitous connectivity, where users demand access anytime, anywhere. This has led to the deployment of public network ports and wireless networks. Current solutions to network access control are inflexible and provide only all-or-nothing access. It is also increasingly important to protect Intranet hosts from other mobile and static hosts on the same Intranet, in order to contain the damage if a host is compromised.
We present an architecture that addresses these issues by using a programmable router to provide dynamic, fine-grained network access control. The Java-enabled router dynamically generates and enforces access control rules using policies and user profiles as input, reducing administrative overhead. Our modular design integrates well with existing authentication and directory servers, further reducing administrative costs. Our prototype is implemented using Nortel's Accelar router and moves users to VLANs with the appropriate access privileges.
Automatic Path Creation
"System Support for Multi-Modal
Information Access and Device Control" [ps]
WMCSA '99, February 1999
Available in the Iceberg release.
A talk on the prototype.
Automatic Path Creation (APC) is a data-flow based framework
for dynamically composing services. The original prototype was used to
implement an interactive voice response (IVR) system for remote
audio/visual devices. Given a set of strongly typed input and
output devices, transcoding operators, and data streams, APC
automatically creates a path from the input device to the output
device. APC involves two steps: choosing the transcoding operators for
the path and instantiating the path. Path instantiation takes a path
that has been created, instantiates its operators, and connects them
together.
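The two steps can be sketched as a search over data types followed by pipeline construction. This is an illustration under stated assumptions, not the Iceberg implementation: operators are described only by an (input type, output type) pair, a breadth-first search picks the shortest chain of operators, and all names are hypothetical.

```python
from collections import deque


def choose_operators(operators, src_type, dst_type):
    """Step 1: breadth-first search over data types.

    operators maps an operator name to its (input_type, output_type) pair.
    Returns the list of operator names on a shortest path, or None if no path exists.
    """
    came_from = {src_type: None}  # type -> (previous type, operator used to reach it)
    queue = deque([src_type])
    while queue:
        t = queue.popleft()
        if t == dst_type:
            path = []
            while came_from[t] is not None:
                t, name = came_from[t]
                path.append(name)
            return path[::-1]
        for name, (in_t, out_t) in operators.items():
            if in_t == t and out_t not in came_from:
                came_from[out_t] = (t, name)
                queue.append(out_t)
    return None


def instantiate(path, factories):
    """Step 2: create each operator and connect them into a single callable."""
    stages = [factories[name]() for name in path]

    def run(data):
        for stage in stages:
            data = stage(data)  # output of one operator feeds the next
        return data

    return run
```

For an IVR-style path from an audio input to a device command, `operators` might contain a speech-to-text operator typed `("audio", "text")` and a command parser typed `("text", "command")`; the search then returns those two names in order, and `instantiate` wires them together.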