College of Engineering and Computing
Browsing College of Engineering and Computing by Issue Date
Now showing 1 - 20 of 690
Publication: A Behavioral Approach to Worm Detection (2006-08). Ellis, Daniel R.; Ammann, Paul.
This dissertation presents a novel approach to the automatic detection of worms using behavioral signatures. A behavioral signature describes aspects of any worm's behavior that are common across manifestations of the worm and that span its nodes in temporal order. Characteristic patterns of worm behaviors in network traffic include 1) engaging in similar network behaviors from one target machine to the next, 2) tree-like propagation, and 3) changing a server into a client. These behavioral signatures are presented within the context of a general worm model. The most significant contribution of this dissertation is the demonstration that an accurate and fast worm detection system can be built using the above patterns. Further, I show that the class of worms detectable using these patterns exceeds what has been claimed in the literature and covers a significant portion of the classes of worms. Another contribution is the introduction of a novel paradigm, the Network Application Architecture (NAA), which concerns possible ways to distribute network application functionality across a network. Three NAAs are discussed. As an NAA becomes more constrained, worm detection gets easier. It is shown that for some NAAs certain classes of worms can be detected with only one packet. The third significant contribution of this dissertation is the capability to evaluate worm detection systems in an operational environment. This capability can be used by other researchers to evaluate their own or others' worm detection systems. The claim is that the capability can emulate practically all worms and that it can do so safely, even in an operational enterprise environment.
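The third behavioral pattern above, a server turning into a client, lends itself to a compact illustration. The sketch below is not the dissertation's detector; the connection-record format is hypothetical, but it shows how the pattern can be flagged from a traffic trace.

```python
# Illustrative sketch of the "server becomes a client" behavioral pattern:
# flag hosts that first accept connections on a port and later initiate
# connections to that same port on other hosts. Not the dissertation's system.
from collections import defaultdict

def find_server_to_client_hosts(events):
    """events: iterable of (time, src, dst, dst_port) connection records
    (hypothetical format). Returns hosts that served a port and later
    initiated connections to that same port on other hosts."""
    served_ports = defaultdict(set)     # host -> ports it has accepted connections on
    suspects = set()
    for _, src, dst, port in sorted(events):
        if port in served_ports[src]:   # src now acts as a client on a port it served
            suspects.add(src)
        served_ports[dst].add(port)     # dst is acting as a server on this port
    return suspects

if __name__ == "__main__":
    trace = [
        (1, "A", "B", 445),   # B serves port 445
        (2, "B", "C", 445),   # B then contacts C on 445: server-to-client transition
        (3, "D", "E", 80),
    ]
    print(find_server_to_client_hosts(trace))   # {'B'}
```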
Item: Ultra-Fast High-Temperature Microwave Processing of Silicon Carbide and Gallium Nitride (2007-10-10). Sundaresan, Siddarth.
A novel solid-state microwave annealing technique is developed in this work for post-implantation annealing of SiC and GaN, and for the controlled growth of SiC nanowires. This technique is capable of heating SiC samples to temperatures in excess of 2100 °C, at ultra-fast temperature ramping rates > 600 °C/s. Microwave annealing of ion-implantation doped (both p-type and n-type) hexagonal SiC was performed in an uncontrolled (air) ambient, as well as a controlled 100% nitrogen atmosphere, with or without a protective graphite cap. Microwave annealing was performed in the temperature range of 1500 °C – 2120 °C, for durations of 5 s – 60 s. Uncontrolled-ambient microwave annealing of SiC at temperatures > 1700 °C resulted in significant oxidation of the SiC surface, leading to a loss of the implanted layer. Annealing in a 100% nitrogen atmosphere eliminated the oxidation problem. For microwave annealing at temperatures ≥ 1800 °C, significant SiC sublimation was observed, even for 15 s annealing. Microwave annealing with a photoresist-converted graphite cap solved this surface sublimation problem for annealing temperatures up to 2100 °C. For the P+- and Al+-implanted SiC, sheet resistances as low as 14 Ω/sq and 1.9 kΩ/sq and majority carrier mobilities as high as 100 cm²/V·s and 8.3 cm²/V·s, respectively, were obtained. These values constitute the best electrical characteristics ever reported for ion-implanted SiC. Microwave annealing at temperatures > 1800 °C removed not only the implantation-induced lattice damage but also the defects introduced during crystal growth. Microwave annealing of in-situ as well as ion-implantation acceptor-doped GaN was performed in the temperature range of 1200 °C – 1600 °C, for a duration of 5 s, using different caps (AlN, MgO, graphite) to protect the GaN surface during annealing. Pulsed-laser-deposited AlN was found to protect the GaN surface effectively for microwave annealing at temperatures as high as 1500 °C. The RMS surface roughness (0.6 nm) of the GaN sample annealed at 1500 °C with an AlN cap is similar to the value (0.3 nm) measured on the as-grown sample, along with a decrease in the compensating deep donor concentration. Cubic 3C-SiC nanowires were grown by a novel Fe, Ni, Pd, and Pt metal catalyst-assisted sublimation-sandwich (SS) method. The nanowire growth was performed in a nitrogen atmosphere, in the temperature range of 1650 °C to 1750 °C, for 40 s durations. The nanowires grow by the vapor-liquid-solid (VLS) mechanism facilitated by metal catalyst islands. The nanowires are 10 μm to 30 μm long, with about 52% of them having diameters in the range of 15 nm – 150 nm, while 14% have diameters in excess of 300 nm.

Item: Microfluidic Devices for Forensic DNA Analysis (2007-11-19). Shah, Jayna.
The development of integrated, miniaturized, and portable DNA analysis systems is crucial to alleviate the massive backlog of unanalyzed samples and to address the ever-increasing demand for these assays. This thesis work contributes toward the development of a fully integrated microdevice capable of "sample in – answer out" operation for forensic DNA analysis. Specifically, this work describes the development of a rapid and robust fabrication protocol for solvent-actuated bonding of polymeric thermoplastic substrates at room temperature, the development of microchannel wall coating strategies to eliminate analyte-wall interactions for high-resolution separation of single-stranded DNA, and the characterization of a thin-film planar microwave transmission line for microfluidic heating applications. The solvent-actuated bonding protocol was based on the difference in capillary forces between the microchannel and the interstitial space between the surfaces of the two substrates to be bonded. This force differential wicked the bonding solvent into the gap between the substrates, causing them to bond. The technique was implemented by placing the two substrates under moderate pressure, applying a moderate pneumatic vacuum to the fluidic channel, and introducing tens of microliters of bonding solvent through one end of the fluidic channel. The effect of the bonding solvent on the dimensions of the microchannel was analyzed, and the mechanical robustness of the bonded devices was also characterized. Electrophoretic separation of single-stranded DNA (ssDNA) was successfully performed to demonstrate the functionality of these devices. To enhance ssDNA separation performance, schemes to modify poly(methyl methacrylate) (PMMA), the primary substrate used in this work, were explored. This two-step process consisted of altering surface hydrophilicity via surface activation using either nitric acid or UV/ozone, followed by coating the surfaces with adsorptive polymers. Contact-angle measurements of the pristine and modified PMMA substrates were performed to quantify the change in wettability of the surface.
A twofold increase in separation efficiency was achieved by implementing these surface passivation strategies. Finally, the use of a thin-film planar microwave transmission line as a microwave power source was investigated for on-chip microwave heating of fluids. The microwave characterization data was used to develop a first-order analytic model of the microwave power absorption. The model was used to understand microwave power flow through the device and to calculate the fraction of the incident power absorbed in the fluid. Additionally, a fit of the predicted temperature obtained using this model to the measured temperature was performed to evaluate the efficiency of this heating method.

Item: Perfective and Corrective UML Pattern-based Design Maintenance with Design Constraints for Information Systems (2007-12-12). Park, Jaeyong.
Pattern-based design, the use of design patterns during the design process, has become widely used in the object-oriented community because its reuse benefits reduce cost and effort while yielding high quality in software development and maintenance. However, design pattern defects can be injected in early design without mandatory control of the evolution of a pattern-based design and assessment of pattern-based designs after changes. It is crucial to maintain correct designs during early design maintenance because defects in early design may cause serious damage to software systems in later software development and maintenance. Hence, there is a need for a systematic design method that prevents design pattern defects from being injected during pattern-based design maintenance, so that the change results of pattern-based designs conform to the corresponding design patterns. Conventional Unified Modeling Language (UML) 2.0 design methods do not provide systematic ways of assessing pattern-based design conformance. The Pattern Instance Changes with UML Profiles (PICUP) design method is developed as an improved design method for perfective and corrective UML pattern-based design maintenance and assessment. Design Pattern in UML Profiles (DPUP) is developed for the formal specification of a design pattern. DPUPs are used for instantiation, maintenance, and assessment of UML pattern-based designs. DPUPs, as the main part of the PICUP design method, provide metamodel-level UML design constraints using UML stereotype notations and metamodel-level Object Constraint Language (OCL) design constraints. In this research, assessments of pattern-based designs in UML class diagrams against the corresponding DPUPs compel maintainers to make correct changes to the designs. Pattern-related information is annotated in the pattern-based design using stereotype notations. Furthermore, the conformance checking of a given UML pattern-based design can be automated by using the assessment tool. An explanatory two-case study is used to evaluate the effectiveness of the PICUP design method with DPUPs, applied to (1) the Lexi document editor and (2) the ARENA game information system.
Questionnaire answers and design pattern defect counts from the two-case study conducted by subject matter experts support the hypothesis that the PICUP method is an improved design method ensuring structural conformance of UML pattern-based designs to the corresponding design patterns during perfective and corrective design maintenance for information systems.

Item: Method for Deriving Multi-factor Models for Predicting Airport Delays (2007-12-13). Xu, Ning.
Traffic Flow Management (TFM), in coordination with Airline Operation Centers (AOC), manages the arrival and departure flow of aircraft at the nation's airports based on the airport arrival and departure rates for each 15-minute segment throughout the day. The management of traffic flow has become so efficient in the U.S. that approximately 95% of delays now occur at the airports (not airborne). Inefficiencies in the traffic flow occur when non-traffic-flow delays (e.g., carrier, turn-around, aircraft swapping, and non-terminal-area weather) are superimposed on the traffic flow delays. Researchers have correlated these non-traffic-flow delays at airports with sets of causal factors and have created models to predict aggregate delays at airports on the time scale of a day. To be consistent with the way traffic flow is managed, a model of the causal factors of delays in 15-minute segments would provide the analytical basis for improving the efficiency of TFM. This dissertation describes the development of multi-factor models for predicting airport delays in 15-minute segments at 34 OEP airports. The models are created using Multivariate Adaptive Regression Splines (MARS). The models, generated using historic individual airport data, exhibit an accuracy of 5.3 minutes for generated delay and 2.1 minutes for absorbed delay across all the airports. A summary of the factors that drive the performance of each airport is provided. The sensitivity of each of the factors is also analyzed. Analysis of the models indicates that the factors that determine airport delays in 15-minute segments are unique to each airport. The most significant factors that generate delays at most of the nation's airports are Carrier Delay, GDP Delay at the outbound destination, and Departure Demand Ratio. Because of the relationship between these factors and the propagation of delays throughout the network, the only way to mitigate system-wide delays is via a holistic network approach. The implications of these results are discussed. The potential benefits from this research include providing (1) researchers and analysts a method to identify systemic causes of delays in the NAS and study the trends of influential factors, and (2) airlines and air traffic managers a means to evaluate predicted delays while executing Traffic Flow Management initiatives.
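MARS models are built from piecewise-linear hinge basis functions. The minimal sketch below fits such a basis to one synthetic factor by least squares; a real MARS implementation, as used in the dissertation, selects knots and interaction terms adaptively, and the feature and numbers here are made up.

```python
# Toy illustration of the hinge basis underlying MARS. Fixed knots and
# ordinary least squares stand in for the adaptive forward/backward passes
# of a real MARS fit; the "departure demand ratio" feature is hypothetical.
import numpy as np

def hinge_basis(x, knots):
    """Columns 1, max(0, x - k), max(0, k - x) for each knot k (the MARS building blocks)."""
    cols = [np.ones_like(x)]
    for k in knots:
        cols.append(np.maximum(0.0, x - k))
        cols.append(np.maximum(0.0, k - x))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
demand_ratio = rng.uniform(0.4, 1.3, 400)                                   # per 15-minute segment
delay = 8.0 * np.maximum(demand_ratio - 1.0, 0) + rng.normal(0, 0.5, 400)   # synthetic generated delay

X = hinge_basis(demand_ratio, knots=[0.8, 1.0, 1.2])    # fixed knots, for illustration only
coef, *_ = np.linalg.lstsq(X, delay, rcond=None)
pred = X @ coef
print("mean absolute error (minutes):", round(float(np.abs(pred - delay).mean()), 3))
```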
Item: A Compiler-Based Approach to Implementing Smart Pointers (2007-12-13). Hoskins, Stephen.
Because of the growing popularity of programming languages with garbage collectors, such as C# and Java, there is clearly a desire for languages that support automated memory management. However, because of the inefficiencies of the garbage collectors of C# and Java, programmers must understand the underlying collector implementations in order to make applications more robust or to run them on real-time systems. Using an implementation of smart pointers written from scratch, this paper attempts to address this problem by exploring techniques that are used by garbage collectors and ultimately concluding which features of object-oriented languages make the task of automating efficient garbage collection more difficult. As a result of the conclusions produced in this paper, it may be possible to create a new language with the simplicity and elegance of Java and the robustness and efficiency of C, without the developer ever needing to perform memory management.

Item: Immobilization of Biological Matter using Transparent Metal Electrodes and Silicon Microstructures (2007-12-14). Sankaran, Bharat.
This thesis describes the development of two different methods to produce an optimal platform for immobilizing biological matter (cells and proteins). First, transparent indium tin oxide (ITO) microelectrodes were fabricated and used to immobilize suspended NIH 3T3 fibroblast cells by positive dielectrophoresis (DEP). The ITO electrodes facilitated microscopic observation of immobilized cells as compared to metallized electrodes. DEP was used to capture arrays of individual cells and small cell clusters within a microfluidic network. The extent of cellular immobilization (no-cell, single-cell, or multiple-cell capture) correlated directly with the applied voltage and inversely with the flow velocity. Specific conditions yielding predominantly single-cell capture were identified. The viability of immobilized cells was confirmed using fluorescence microscopy. In the second method, silicon microtechnology was used to make silicon microarray sector slides for facilitating high-accuracy protein interactions and identifications. Photolithography and anisotropic chemical etching were used to create pyramid-like array structures in each sector to increase the sector surface area and hence the concentration of the reactant. The silicon microarrays were coated with different dielectric films to investigate whether they improve the presence and relative abundance of specific variants of key signaling molecules. The microarray structures were also modified with a chemical surface coating, 3-mercaptopropyltrimethoxysilane (MPTMS). Competitive binding assays were then used to test the protein binding accuracy and sensitivity of the silicon-based microarrays. Native silicon and dielectric-layer microarrays produced poor protein molecule capture during the Reverse Phase Antibody process. The presence of MPTMS was found to improve the extent of protein immobilization, thereby improving characterization of immobilized proteins on microarray structures.
Item: Solutions to Constrained Path Computation in Multi-layer Networks (2007-12-17). Gong, Shujia.
Traffic engineering methods as applied to traditional IP networks rely on link attributes advertised by link state protocols, such as Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (IS-IS). Extending link state protocols to include heterogeneous transport layer attributes brings a more comprehensive view of networks for path computation. A unified control plane, which enables horizontal cooperation between peer layers and vertical integration across layers, facilitates the optimization of network resources and the instantiation of cross-layered paths, yet brings additional challenges to path computation. These include, but are not limited to, Generalized Label Continuity Constraints, such as wavelength continuity and VLAN (Virtual Local Area Network) tag continuity, and Interface Specific Adaptation Constraints, such as switching-type adaptation constraints when a cross-layered path needs to be set up. These constraints cannot be satisfied by traditional CSPF (Constrained Shortest Path First) or integer linear programming. Moreover, the network graph may not be enough to describe the connectivity of network resources associated with wavelength, VLAN tag, or switching-type adaptation capabilities. Furthermore, the dynamic nature of the networks makes exhaustive search and other NP-hard algorithms practically unattractive. In this dissertation, we provide the Common Vector solution to the Generalized Label Continuity Constraints. Mathematical analysis and simulation results demonstrate that the algorithm addresses the scalability problem of the existing wavelength-graph solution, with only minor performance degradation from a blocking perspective when the traffic load is not high. In particular, when the label space grows fast, the blocking caused by the lack of common labels is further reduced. Link performance bounds in a ring topology, which can help evaluate the performance degradation of the Common Vector solution more accurately, are also discussed. For Interface Specific Adaptation Constraints, we provide the Channel Graph solution, which transforms the network graph into a channel graph. We prove that this solution addresses both the optimality and scalability problems of path computation in multi-layer networks. We also prove that, under the assumption that the connectivity and cost of adaptation depend on the switching type associated with an interface, the Channel Graph solution is the most efficient. In a sparse network, the Channel Graph solution has the same order of computational complexity as running CSPF on the network graph. Simulation results that corroborate those from the analytical models are presented in this dissertation. The solutions to path computation discussed here lend themselves as good candidates for the Internet of the future. The proposed solutions for switching-type adaptation and VLAN tags have also been implemented and verified in practice, as part of path computation in the Dynamic Resource Allocation in GMPLS Optical Networks (DRAGON) project, an NSF-sponsored project to create dynamic, deterministic, and manageable end-to-end network transport services for high-end e-Science applications.
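A simplified reading of the label-continuity constraint behind the Common Vector solution is a Dijkstra-style search that carries, with each partial path, the set of labels (wavelengths or VLAN tags) still available on every traversed link. The sketch below illustrates only the constraint; it is not the dissertation's algorithm or its complexity analysis.

```python
# Simplified sketch of label-continuity path computation: a Dijkstra-style
# search whose states carry the intersection of labels available on all
# traversed links. A path is pruned as soon as no common label survives.
import heapq, itertools

def shortest_path_with_label_continuity(graph, src, dst):
    """graph: {node: [(neighbor, cost, frozenset_of_labels), ...]}.
    Returns (cost, path, common_labels) or None if no label-continuous path exists."""
    counter = itertools.count()              # tie-breaker so sets are never compared
    heap = [(0, next(counter), src, None, [src])]
    best = {}                                # (node, labels) -> best cost seen so far
    while heap:
        cost, _, node, labels, path = heapq.heappop(heap)
        if node == dst:
            return cost, path, labels
        for nxt, c, link_labels in graph.get(node, []):
            new_labels = link_labels if labels is None else labels & link_labels
            if not new_labels:
                continue                     # no common label end-to-end: prune
            key = (nxt, new_labels)
            if best.get(key, float("inf")) <= cost + c:
                continue
            best[key] = cost + c
            heapq.heappush(heap, (cost + c, next(counter), nxt, new_labels, path + [nxt]))
    return None

if __name__ == "__main__":
    g = {"A": [("B", 1, frozenset({1, 2})), ("C", 1, frozenset({3}))],
         "B": [("D", 1, frozenset({2}))],
         "C": [("D", 1, frozenset({3}))]}
    print(shortest_path_with_label_continuity(g, "A", "D"))
```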
Item: Location-based Propagation Modeling for Opportunistic Spectrum Access in Wireless Networks (2007-12-17). Erpek, Tugba.
Wireless spectrum has become a scarce resource due to the exponential growth of the number and type of devices that utilize the electromagnetic spectrum. The cost of long-term leasing of spectrum has proven to be a major roadblock to efficient use of the frequency spectrum. Moreover, spectrum measurement studies have shown that substantial portions of the allocated wireless spectrum are highly underutilized. Thus, Dynamic Spectrum Access (DSA) devices have been proposed to share the spectrum dynamically between users. The idea is that the DSA devices will continuously scan the spectrum and start to transmit in a channel when a licensed user's operation is not detected in that channel. Detailed path loss models are needed to calculate the propagation loss between a DSA device transmitter and a licensed user receiver, in order for the DSA device to avoid causing interference to the receiver by using overly high transmit power levels. This thesis proposes a novel propagation loss model called Location Based Propagation Modeling (LPM), based on the existing TIREM path loss model. The TIREM model gives the median value of the path loss for a given transmitter and receiver pair, and the user needs to know the precise locations of the transmitter and receiver to calculate the path loss with the TIREM model. However, for DSA applications, we usually do not know the precise locations of the licensed user receivers. Furthermore, the TIREM model requires detailed terrain information stored in memory to calculate path loss, but DSA devices have limited memory. As a result, we need a compact representation of the TIREM model that gives the path loss without the need to store terrain information in memory. These were the motivations to develop the LPM. DSA devices also require accurate spectral estimation methods to determine whether a channel is occupied at a specific time and location. Two spectral estimation methods, the multitaper spectral estimation method and the conventional FFT-based spectral estimation method, are compared in this thesis using real signal measurements. Our numerical results show that the multitaper approach yields a significant increase in the number of harvested channels while maintaining a smaller probability of false alarm.
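The comparison between the conventional FFT-based estimate and a multitaper estimate can be illustrated with scipy's DPSS (Slepian) tapers on a synthetic signal. This is only an illustration of the two estimators, not the thesis's measurement pipeline, and the signal and parameters below are made up.

```python
# Hedged sketch: single-window periodogram vs. a simple multitaper estimate
# built by averaging periodograms computed with orthogonal DPSS tapers.
import numpy as np
from scipy.signal import periodogram
from scipy.signal.windows import dpss

fs = 1000.0                        # sample rate in Hz (synthetic example)
t = np.arange(4096) / fs
x = np.sin(2 * np.pi * 120 * t) + 0.5 * np.random.default_rng(0).normal(size=t.size)

# Conventional single-window FFT estimate.
f, pxx_fft = periodogram(x, fs=fs)

# Multitaper estimate: average periodograms from K orthogonal DPSS tapers.
NW, K = 4, 7                       # time-bandwidth product and number of tapers
tapers = dpss(len(x), NW, K)       # shape (K, N), unit-energy tapers
pxx_mt = np.mean([np.abs(np.fft.rfft(tap * x)) ** 2 for tap in tapers], axis=0) / fs

noise = f > 200                    # frequencies away from the 120 Hz tone
rel_var = lambda p: np.var(p[noise]) / np.mean(p[noise]) ** 2
print("relative variance, single-window FFT:", round(float(rel_var(pxx_fft)), 3))
print("relative variance, multitaper (K=7): ", round(float(rel_var(pxx_mt)), 3))
```

The multitaper estimate trades a small amount of spectral resolution for a much smoother noise floor, which is why it detects more usable channels at a lower false-alarm rate.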
Item: Performance of Devices Made of Large Band-gap Semiconductors, SiC and GaN (2007-12-17). Okayama, Taizo.
Silicon (Si) and gallium arsenide (GaAs) devices have limitations for certain applications, such as high power and/or high frequency, due to their material properties. As partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering, devices made using two promising substrate materials, silicon carbide (SiC) and gallium nitride (GaN), were studied for high-power and high-frequency applications, respectively. SiC is considered a suitable material for high-power devices such as the double-implanted metal-oxide-semiconductor field-effect transistor (DMOSFET), in which the current flows vertically to the substrate contact. The DMOSFET consists of several hundred cells connected in parallel, making it possible to sustain both high blocking voltage and high current. GaN grown on SiC is considered a suitable material for high-frequency and high-power applications. The high electron mobility transistor (HEMT) fabricated with GaN and aluminum gallium nitride (AlGaN) utilizes a conduction-band offset and the piezoelectric polarization effect at the junction between these two materials to produce a highly conductive channel. However, in spite of their promise, the performance of both SiC DMOSFET and GaN HEMT devices with respect to their Si and GaAs counterparts is not well understood. In this work, the SiC DMOSFET devices were first characterized for their threshold voltage, drain current, and breakdown voltage stability, and then the GaN devices for their efficiency and linearity performance at high frequency. The results for the SiC DMOSFETs were fitted with simulation to determine the location of the interface charge responsible for the instability in device behavior. The charge at the inner region of the junction termination extension has the most pronounced effect on the breakdown voltage instability. The interlayer dielectric (ILD) composition that can minimize the SiC DMOSFET instability problem is also determined, considering several limitations on the maximum weight percentages of the boron and phosphorus constituent dopants in the boro-phospho-silicate glass (BPSG) ILD layer. A BPSG composition of 2.4 weight percent B and 5 weight percent P is projected as optimum for the processing conditions used for making the SiC DMOSFETs of this study. Results for the GaN HEMTs were compared with those of GaAs pseudomorphic HEMTs.
Item: Policy-Controlled Email Services (2007-12-18). Kaushik, Saket.
The context for this research proposal is an area of work that seeks to replace the current state of access control for email, in which an arbitrary message sender enjoys unregulated "append" access to a message recipient's mailbox, with a policy framework in which each principal involved in a message exchange, namely the sender, the sender's service provider, the recipient's service provider, and the recipient, can articulate its interests for regulating access to resources under its control. Though there exist a vast number of automatic control techniques to limit the transmission of email messages, specifically to stop unwanted messages from reaching a recipient, they are still prone to dropping some desirable messages. This often prompts recipients and other principals to relax the message acceptance requirements, which in turn makes them easy targets for commercial or fraudulent mail. We propose a novel scheme to overcome this handicap. Our scheme makes the transmission mechanism aware of the documentation required with a message to make it acceptable downstream. For instance, if a recipient wishes to receive only those messages that carry a monetary guarantee, also known as a bond, then the transmission system must be made aware of this fact so that desirable messages can satisfy this requirement. Consequently, recipients and other principals can express and enforce precise acceptance requirements, through explicit policies, and gain control over their resources. In addition to the problem of enforcing precise acceptance requirements in the transmission process, there exists no means of flexibly combining available email-control techniques tailored to the needs of a particular recipient or its service provider. This is the primary reason for the inability to express requirements suited to a particular individual. For instance, it is currently not possible to state a requirement like "allow messages initiated by a human sender affiliated with George Mason University, even though the spam filter ranks them as possible spam." In our view, a policy-based approach is well suited to attain this objective. The use of these control techniques leads to significant deviation of behavior from what is prescribed in the current email delivery protocol; in other words, the protocol falls significantly short of representing current delivery requirements. Clearly, it requires an overhaul to correspond to current requirements and reduce ambiguities during protocol play, a goal that we propose to research in this study. We propose using constraint logic programming (CLP) to articulate and evaluate different types of policies, because, given the way messages are constructed and acceptance conditions are evaluated, a CLP approach is a natural way to model these operations. In addition, the CLP approach promises to simplify the task of providing feedback for rejected messages, so that they can be revised and retransmitted. Since declarative policies can describe control at a very high level, we also propose to study refinements of these high-level directives into protocol-level actions.

Item: Reliable Bulk Data Dissemination in Sensor Networks (2007-12-18). Huang, Leijun.
A wireless sensor network consists of a large number of resource-constrained sensor nodes that are self-organized into a multi-hop network and cooperate on a single task. In many situations, sensor networks need to run for a long time once deployed. When the environment changes during their lifetime, updating the code image or application data at the nodes for a new task becomes necessary, making data dissemination a critical issue: a large data object needs to be reliably propagated to all of the nodes in a network. While most current sensor nodes are equipped with a multiple-channel radio, existing data dissemination approaches such as Deluge [1] do not take advantage of multiple channels. Moreover, these approaches mostly focus on object delivery latency, while energy efficiency is also very important due to the resource constraints of the sensor nodes. This dissertation proposes three novel protocols for reliable bulk data dissemination, named McTorrent, CORD, and McCORD, that focus on both object delivery latency and energy efficiency. These protocols use multiple channels, or a core-based two-phase approach, or both techniques, to reduce object delivery latency and energy consumption at each node. The results from experiments on both indoor and outdoor testbeds and extensive simulations in various scenarios show that these protocols significantly reduce the latency and/or energy consumption compared to the existing approaches.
Item: Efficient Inference For Hybrid Bayesian Networks (2007-12-19). Sun, Wei.
Uncertainty is everywhere in real life, so we have to use stochastic models for most real-world problems. In general, both the system's mechanism and the observable measurements involve random noise. Therefore, probability theory and statistical estimation play important roles in decision making. First of all, we need a good knowledge representation to integrate information under uncertainty; then we need to conduct efficient reasoning about the state of the world given noisy observations. Bayesian networks (BNs) provide a compact, efficient, and easy-to-interpret way to model the joint probability distribution of random variables over a problem domain. A Bayesian network encodes the dependency relationships between random variables in a graphical probabilistic model. The structural properties and expressive power of Bayesian networks make them an excellent knowledge base for effective probabilistic inference. Over the past several decades, a number of exact and approximate inference algorithms have been proposed and applied for inference in different types of Bayesian networks. However, in general, BN probabilistic inference is NP-hard. In particular, probabilistic reasoning for BNs with nonlinear, non-Gaussian hybrid models is known to be one of the most difficult problems. First, no exact method can compute the posterior distributions in such cases. Second, relatively little research has been done on general hybrid models. Unfortunately, most real-world problems are naturally modeled with both categorical and continuous variables, with typically nonlinear relationships. This dissertation focuses on hybrid Bayesian networks containing both discrete and continuous random variables. The hybrid model may involve nonlinear functions in conditional probability distributions, and the distributions could be arbitrary. I first give a thorough introduction to Bayesian networks and a review of the state-of-the-art inference algorithms in the literature. Then a suite of efficient algorithms is proposed to compute the posterior distributions of hidden variables for arbitrary continuous and hybrid Bayesian networks. Moreover, in order to evaluate the performance of the algorithms with hybrid Bayesian networks, I present an approximate analytical method to estimate the performance bound. This method can help the decision maker understand the prediction performance of a BN model without extensive simulation, and can also help the modeler build and validate a model effectively. Solid theoretical derivations and promising numerical experimental results show that the research in this dissertation is fundamentally sound and can be applied in various decision support systems.
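To illustrate why such networks call for approximate methods, the toy sketch below runs likelihood weighting on a two-node hybrid network with a discrete parent and a nonlinearly linked Gaussian child. It is a generic textbook sampler, not one of the algorithms proposed in the dissertation, and the model is invented for the example.

```python
# Toy likelihood-weighting sketch for a tiny hybrid Bayesian network:
# discrete root D with prior P(D), continuous child Y | D=d ~ N(exp(d), 1).
# We estimate P(D | Y = y_obs) by sampling D and weighting by the evidence.
import numpy as np

rng = np.random.default_rng(1)
prior = {0: 0.7, 1: 0.3}                        # P(D)
mean = lambda d: np.exp(d)                      # nonlinear link from parent to child mean

def gaussian_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def posterior_d_given_y(y_obs, n_samples=100_000):
    """Likelihood weighting: sample D from its prior, weight by P(y_obs | D)."""
    d = rng.choice([0, 1], size=n_samples, p=[prior[0], prior[1]])
    w = gaussian_pdf(y_obs, mean(d))
    return {k: float(w[d == k].sum() / w.sum()) for k in (0, 1)}

print(posterior_d_given_y(2.5))   # mass shifts toward D=1, since exp(1) ≈ 2.72 is near 2.5
```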
Item: An Analysis of Island Models in Evolutionary Computation (2007-12-19). Skolicki, Zbigniew.
Island models (IMs) are a class of distributed evolutionary algorithms (EAs) in which the population is split into multiple sub-populations called islands. Separate EAs run independently on each island, but they interact by means of migrating individuals. Therefore, IMs are different both from a single-population standard EA and from a set of multiple isolated EAs. IMs are interesting for several reasons. They have been reported to yield better results than standard EAs. IMs are also advantageous when computational tasks must be distributed across multiple machines, because their structure is easy to parallelize. However, despite many studies, no comprehensive theory describing their behavior has been developed. Due to the lack of theory and a complex architecture with many control parameters, setting up IMs has been a trial-and-error process, guided mostly by "rules of thumb." In this dissertation, I adopt a two-level (intra- and inter-island) view of IMs and show how this approach makes understanding their dynamics easier. They behave very differently than standard EAs, and in order to take full advantage of this, I propose a better utilization of the inter-island level of evolution. In particular, I argue for setups with many relatively small islands, and I also show that compositional evolution may scale to the inter-island level. The two levels of evolution influence each other, and I analyze this interaction more deeply. Migrations profoundly change the local dynamics and stimulate evolution, which often ultimately results in better performance. I study the role of genetic operators in this behavior and also create mathematical models of after-migration dynamics. This analysis gives us a better understanding of mixing and the survival of genes locally, and these processes in turn determine the type and level of interaction between islands globally. Further, using island heterogeneity enhances the inter-island evolution. Following the study, I analyze IM behavior on a range of test problems, including two complex domains. This dissertation improves our understanding of the dynamics of IMs and suggests a qualitative change in the way we think about them. This perspective offers new guidelines for configuring IM parameters and opens new directions for future work.
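A compact sketch of the island-model structure discussed above: several islands evolve a toy one-max bit-string problem independently and exchange their best individuals over a ring topology every few generations. The parameter values are illustrative, not recommendations from the dissertation.

```python
# Minimal island-model EA on the one-max problem: independent evolutionary
# loops per island, with periodic migration of each island's best individual
# to its neighbor in a ring.
import random

def evolve_island(pop, fitness, mut_rate=0.02):
    """One generation: binary tournament selection plus bit-flip mutation."""
    new_pop = []
    for _ in range(len(pop)):
        a, b = random.sample(pop, 2)
        parent = a if fitness(a) >= fitness(b) else b
        new_pop.append([1 - g if random.random() < mut_rate else g for g in parent])
    return new_pop

def island_model(n_islands=4, pop_size=20, length=40, generations=60, interval=10):
    fitness = sum                                              # one-max objective
    islands = [[[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        islands = [evolve_island(pop, fitness) for pop in islands]
        if gen % interval == 0:                                # migration step
            migrants = [max(pop, key=fitness) for pop in islands]
            for i, m in enumerate(migrants):
                target = islands[(i + 1) % n_islands]          # ring topology
                target[random.randrange(len(target))] = m[:]   # replace a random individual
    return max((max(pop, key=fitness) for pop in islands), key=fitness)

random.seed(0)
print("best fitness:", sum(island_model()))
```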
Item: Abstraction of Reasoning For Problem Solving and Tutoring Assistants (2008-03-25). Le, Vu.
This dissertation presents an approach to the abstraction of the reasoning of a knowledge-based agent that facilitates human-agent collaboration in complex problem solving and decision making, and the development of systems for tutoring expert problem solving to non-experts. Effective human-agent collaboration requires that the user be able to easily understand the complex reasoning generated by the agent. The methods presented in this dissertation allow the partition of a complex reasoning tree into meaningful and manageable sub-trees, the abstraction of individual sub-trees, and the automatic generation of an abstract tree that plays the role of a table of contents for the display, understanding, and navigation of the concrete tree. Abstraction of reasoning is also very important for teaching complex problem solving to non-experts. This dissertation presents a set of integrated methods that allow the abstraction of complex reasoning trees to define abstract problem-solving strategies for tutoring, the rapid development of lesson scripts for teaching these strategies to non-experts, and the automatic generation of domain-specific lessons. These methods are augmented with ones for learning and context-sensitive generation of omission, modification, and construction test questions to assess a student's problem-solving knowledge. The developed methods have been implemented as an extension of the Disciple learning agent shell and have led to the development of the concept of a learning and tutoring agent shell. This is a general tool for building a new type of intelligent assistant that can learn complex problem-solving expertise directly from human experts, support human experts in problem solving and decision making, and teach its problem-solving expertise to non-experts. The developed learning and tutoring shell has been used to build a prototype tutoring system in the intelligence analysis domain, which has been used and evaluated in courses at the US Army War College and George Mason University.

Item: Model-Based Testing for Software Product Lines (2008-05-16). Olimpiew, Erika Mir.
A Software Product Line (SPL), or family of systems, is a collection of applications that have so many features in common that it is worthwhile to study and analyze the common features as well as the features that differentiate these applications. Model-based design and development for SPLs extends modeling concepts for single applications to model the commonality and variability among the members of the SPL. Previous research on model-based functional testing methods for SPLs uses existing requirements models, such as feature and use case models, to create reusable test specifications that can be configured for applications derived from a SPL. Feature-based test coverage criteria can be applied to determine which applications to test when it is not feasible to test all possible applications of a SPL. However, previous research on functional testing methods for SPLs does not apply feature-based test coverage criteria together with a use case-based approach to creating reusable test specifications for a SPL. This research describes a functional test design method for SPLs (Customizable Activity diagrams, Decision tables and Test specifications, or CADeT) that applies feature-based test coverage criteria together with a use case-based approach to creating reusable test specifications for a SPL. Features from a feature model are associated with test models created from the use cases of a SPL using feature condition variables. The values of a feature condition represent possible feature selections, so that selecting a value for the feature condition selects and customizes the test models associated with that feature. With CADeT, activity diagrams are created from the use case descriptions of a SPL. Reusable test specifications are traced from the use case activity diagrams and described in decision tables. The relationships of features to activity diagrams are also portrayed in decision tables, and then analyzed to apply a feature-based test coverage criterion to the SPL. Representative application configurations are generated to cover all features, all use case scenarios, and all relevant feature combinations of a SPL. Reusable test specifications are selected and customized for each application configuration, and then used to test the corresponding application implementation. Furthermore, CADeT is extended to use separation of concerns to customize the reusable test specifications during feature-based test derivation (CADeT-SoC). Instead of using feature conditions to customize these test specifications, CADeT-SoC separates the variable test steps from the test specifications, and then weaves selected test steps with these test specifications during feature-based test derivation. CADeT-SoC is more suitable than CADeT for customizing the test specifications of a SPL with many variation points repeated across several use cases. Using CADeT-SoC reduced the effort needed to define variable test steps for the variation points in the test specifications of each SPL. The feasibility of the CADeT and CADeT-SoC methods was evaluated in three studies on two SPLs: an Automated Highway Toll System (AHTS) SPL and a Banking System SPL. The results of these studies show that CADeT and CADeT-SoC can be used to create reusable test specifications that cover all use case scenarios, all features, and all relevant feature combinations of each of these two SPLs. The feature model of each SPL and the relationships of features to test specifications were analyzed to determine the relevant feature combinations, and a feature-based coverage criterion was applied to reduce the number of application configurations to test. Using CADeT also reduced the number of test specifications needed to satisfy these criteria, as compared with two alternative approaches. The contribution of this research is CADeT, a model-based test specification design method, and CADeT-SoC, an extension of CADeT that uses separation of concerns to customize the test specifications for an application derived from the SPL. CADeT and CADeT-SoC can help a test engineer create reusable test specifications to cover all use case scenarios, features, and relevant feature combinations of a SPL. These test specifications can be customized during feature-based test derivation for a set of applications derived from a SPL. Using CADeT and CADeT-SoC reduces the number of application configurations and test specifications that need to be created to cover all use case scenarios, features, and relevant feature combinations in a SPL.
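The feature-condition mechanism can be illustrated with a small sketch: reusable test steps carry feature conditions, and selecting a configuration's features selects and customizes the applicable steps. The feature names and step texts below are hypothetical and do not follow CADeT's notation.

```python
# Small sketch of feature-based selection of reusable test specifications:
# each test step carries a feature condition, and choosing a configuration
# (a set of selected features) derives the steps that apply to that member.
REUSABLE_TEST_SPEC = [
    {"step": "Insert card and authenticate",   "feature": None},          # common to all members
    {"step": "Select toll-tag replenishment",  "feature": "TollTag"},
    {"step": "Select cash withdrawal",         "feature": "Withdrawal"},
    {"step": "Print receipt",                  "feature": "Receipt"},
]

def derive_test_spec(selected_features):
    """Return the test steps applicable to one application configuration."""
    return [row["step"] for row in REUSABLE_TEST_SPEC
            if row["feature"] is None or row["feature"] in selected_features]

# Two application configurations derived from the same reusable test spec.
print(derive_test_spec({"Withdrawal", "Receipt"}))
print(derive_test_spec({"TollTag"}))
```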
Item: Service Assurance in Insecure Networks with Byzantine Adversaries (2008-06-09). Rabinovich, Paul.
This dissertation describes research into security threats in new communication environments and methods to counter them. More specifically, we consider networks where some nodes perform an important application function such as routing, filtering, aggregation, etc., and may become compromised while doing so. We consider three types of networks: Internet-scale publish/subscribe networks, aggregating sensor networks, and peer-to-peer massively multiplayer online games (MMOGs) and virtual environments. These environments are complementary to each other in many respects. A publish/subscribe network is responsible for disseminating data objects produced by one set of nodes (publishers) to another set of nodes (subscribers). Subscribers want to receive those and only those objects that satisfy their interests. A large-scale publish/subscribe network using many network providers, each under its own administrative control, presents numerous opportunities for mischief. In many cases the network providers will not trust one another and will not be fully trusted by the end users. A malicious network provider may insert, delete, modify, reorder, misdirect, or delay messages, and remain undetected. The first topic of our research is how to assure service integrity in such networks if some of the intermediate nodes may attack the system in an arbitrary fashion. We solve the problem by creating filtering agents corresponding to user subscriptions, and mapping these agents either to hosts from trusted providers or to clusters of hosts taken from multiple non-trusted providers. As a whole, each cluster can be trusted, since only a relatively small number of providers are assumed to be malicious. A sensor network is a network connecting hundreds or thousands of sensors: tiny battery-powered computers equipped with units measuring some physical phenomenon (e.g., light intensity, temperature, humidity, ambient chemical composition) and wireless radio transceivers. We study aggregating sensor networks that do not fully propagate raw measurements to their users but rather perform in-network aggregation with the intent of lowering the total amount of transmitted data, thus preserving bandwidth and energy. As sensor networks are frequently placed in hostile environments (for instance, in military applications), it is important to devise mechanisms guaranteeing the integrity of their service under attack, and their survivability. In this dissertation we discuss CoDeX, a collect-detect-exclude framework for secure aggregation in sensor networks. Our approach to solving the problem is based on the fact that many physical phenomena exhibit strong spatial correlation. Sensor nodes can take advantage of the broadcast nature of radio transmissions, receive measurements from their neighbors, and compare them with their own results. If the values differ significantly, the fact can be reported to the user (the "collect" phase). If a node is the subject of many such reports, it is probably compromised (the "detect" phase) and should be removed from the network (the "exclude" phase). We complement this approach with the use of randomized delivery (aggregation) trees, cryptography, and repeated aggregation of the same data in different configurations. Peer-to-peer massively multiplayer online games and virtual environments pose challenges similar to those in the other two types of networks: all three lack centralized control and run autonomously. Like pub-sub networks, P2P-based MMOGs may span the Internet and contain thousands and, potentially, hundreds of thousands of nodes. Like sensor networks, these systems almost completely rely on end nodes to provide infrastructure services. Unfortunately, most games do not provide sufficient safeguards against cheating and fraud perpetrated by the players. We developed FRAPPE, an architecture that significantly reduces this vulnerability by forming trusted "supernodes" out of non-trusted peer machines, employing authentication, confidentiality, and pseudonymity services, using secret sharing and other secure multiparty computation techniques, and constructing anonymizing tunnels to hide the identities of communicating parties. We also introduce a useful primitive, called local broadcast with verification (LBV), used to solicit the services of peers in a particular neighborhood and subsequently verify that the peers properly executed the protocol. Using a combination of analysis and experimental results, we demonstrate that all three approaches provide strong service assurance guarantees for their respective types of networks.
Item: A Framework and Methodology for Ontology Mediation through Semantic and Syntactic Mapping (2008-06-09). Muthaiyah, Saravanan.
Ontology mediation is the process of establishing a common ground for interoperability between domain ontologies. Ontology mapping is the task of identifying concept and attribute correspondences between ontologies through a matching process. Ontology mediation and mapping enable ontologists to borrow and reuse rich schema definitions from existing domain ontologies that have already been developed by other ontologists. For example, a white wine distributor could maintain a white wine ontology that has only white wine concepts. This distributor may then decide at some point in the future to include other wine classifications in his current ontology as well. Instead of creating red wine or dessert wine concepts in his existing ontology, the distributor could simply borrow these concepts from existing red wine and dessert wine ontologies. As such, ontology mapping becomes necessary. The practice of matching ontology schemas today is labor-intensive. Although semi-automated systems have been introduced, they are based on syntactic matching algorithms, which do not produce reliable results. Thus my thesis statement is that the hybrid approach, i.e., the Semantic Relatedness Score (SRS), which combines both semantic and syntactic matching algorithms, provides better results in terms of greater reliability and precision when compared to pure syntactic matching algorithms. This research validates that SRS provides higher precision and relevance compared to the syntactic matching techniques that have been used previously. SRS was developed by rigorously testing thirteen well-established matching algorithms and choosing a composite measure of the best combination of five out of those thirteen measures. This thesis also provides an end-to-end approach, comprising a framework, process methodology, and architecture for ontology mediation. Since implementing a fully automated system without any human intervention would not be a realistic goal, a semi-automated approach is undertaken in this thesis. In this approach, an ontologist is assisted by a mapping system which selects the best candidates to be matched from the source and target ontologies using SRS. The goal was not only to reduce the workload of the ontologist, but also to provide results that are reliable. A literature survey of current ontology mediation research initiatives, such as InfoSleuth, XMapper, ONION, FOAM, FCA-Merge, KRAFT, CHIMERA, PROMPT, and OBSERVER, among others, revealed that the state of the art of ontology mediation is to a large extent based on syntactic schema matching that supports binary (1:1) schema matches only. A generic solution for schema matching based on SRS is presented in this thesis to overcome these limitations. A similarity matrix for concept similarity measures is introduced, based on several cognitive and quantitative techniques such as computational linguistics, Latent Semantic Analysis (LSA), distance vectors, and lexical databases (WordNet). The six-part matching algorithm is used to analyze RDF, OWL, and XML schemas and to provide similarity scores, which are then used to populate the similarity matrix. The contribution here is twofold. First, this approach gives a composite similarity metric and also supports complex mappings (1:n, 1:m, m:1, and n:m). Second, it provides higher relevance, reliability, and precision. The validation of this approach is demonstrated by comparing SRS results with those of human domain experts. Empirical evidence provided in this document clearly shows that the hybrid method produced higher correlation, better relevance, and more reliable results than purely syntactic matching systems. Predefined Semantic Web Rule Language (SWRL) rules are also introduced to concatenate attributes, discover new relations, and enforce the assertion box (ABox) instances. Reasoning for consistency, coherence, ontology classification, and inference measures is also introduced. An actual implementation of this framework and process methodology for the mapping of security policy ontologies (SPRO) is provided as a case study. Another case study, on achieving interoperability for e-government services with SWRL rules, is also presented. Both SRS and SWRL rules are highlighted in this document as complementary measures for the process of semantic bridging. Several tools were used for a proof-of-concept implementation of the methodology, including Protégé, Racer Pro, Rice, and PROMPT.
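An illustrative hybrid score in the spirit of SRS can be formed by combining one syntactic measure (a character-sequence ratio) with one semantic measure (WordNet path similarity via NLTK). The actual SRS composes five of thirteen measures; the equal weighting below is only an example, and the word pairs are invented.

```python
# Illustrative hybrid (semantic + syntactic) similarity. Requires nltk and its
# 'wordnet' corpus (nltk.download('wordnet')). Not the SRS formula itself.
from difflib import SequenceMatcher
from nltk.corpus import wordnet as wn

def syntactic_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def semantic_sim(a, b):
    synsets_a, synsets_b = wn.synsets(a), wn.synsets(b)
    if not synsets_a or not synsets_b:
        return 0.0
    scores = [s1.path_similarity(s2) for s1 in synsets_a for s2 in synsets_b]
    return max((s for s in scores if s is not None), default=0.0)

def hybrid_score(a, b, w_syn=0.5, w_sem=0.5):
    return w_syn * syntactic_sim(a, b) + w_sem * semantic_sim(a, b)

for pair in [("merlot", "wine"), ("merlot", "mermaid")]:
    print(pair, round(hybrid_score(*pair), 3))
```

The second pair scores well syntactically but poorly semantically, which is exactly the kind of false match a purely syntactic system would accept and a hybrid score penalizes.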
Item: Applying Decomposition Methods to Solve a Stochastic Available-To-Promise Problem (2008-06-18). Pangarad, Arm.
The available-to-promise (ATP) model is a mechanism that provides recommendations about when to accept customer orders, taking into account product availability information, current customer orders, and future orders in order to maximize overall profits. It has become an important tool in the decision-making process for manufacturing businesses. In this thesis, we present the stochastic available-to-promise problem, which addresses the need to accept or reject, in real time, orders for customizable computer configurations when the manufacturer cannot predict when the most profitable customers might arrive but does have some probabilistic information about the likelihood of order arrivals and their requirements. Because the problem is stochastic, modeling all possible future scenarios results in an exponentially large problem. Even when one limits the total number of scenarios considered, solving the problem with off-the-shelf commercial solvers such as CPLEX results in computation times that are too large to be usable. We study the underlying structure of the model and propose decomposition methods to solve it. We test both a Dantzig-Wolfe decomposition (a column-generation approach) and a Benders decomposition (a row-oriented decomposition), and we compare the solution times of both methods to those of CPLEX.
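As a toy illustration of the accept-or-reject trade-off (not the thesis's Dantzig-Wolfe or Benders decomposition), the sketch below accepts an incoming order only if its profit exceeds the expected profit the committed components could earn across sampled future-order scenarios. All component names, quantities, and prices are made up.

```python
# Toy scenario-based accept-or-reject check for an ATP-style decision.
INVENTORY = {"cpu_fast": 2, "cpu_std": 10, "ram_8gb": 12}

def can_fill(order, inventory):
    return all(inventory.get(c, 0) >= q for c, q in order.items())

def remaining_after(order, inventory):
    left = dict(inventory)
    for c, q in order.items():
        left[c] = left.get(c, 0) - q
    return left

def expected_opportunity_cost(order, inventory, scenarios):
    """Expected future profit forgone if components are committed to `order` now."""
    left = remaining_after(order, inventory)
    cost = 0.0
    for prob, (future_order, future_profit) in scenarios:
        if can_fill(future_order, inventory) and not can_fill(future_order, left):
            cost += prob * future_profit
    return cost

def accept(order, profit, inventory, scenarios):
    return can_fill(order, inventory) and \
           profit >= expected_opportunity_cost(order, inventory, scenarios)

scenarios = [
    (0.3, ({"cpu_fast": 2, "ram_8gb": 2}, 900.0)),   # a later high-margin order
    (0.7, ({"cpu_std": 1,  "ram_8gb": 1}, 120.0)),
]
order = {"cpu_fast": 2, "ram_8gb": 2}
print(accept(order, 250.0, INVENTORY, scenarios))    # False: expected forgone profit is 270
```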
Item: Scalable Role & Organization Based Access Control and Its Administration (2008-06-27). Zhang, Zhixiong.
In Role Based Access Control (RBAC), roles are typically created based on job functions inside an organization. Traditional RBAC does not scale up well for modeling security policies spanning multiple organizations. To solve this problem, a family of extended RBAC models called Role and Organization Based Access Control (ROBAC) models, together with its administrative models, is proposed and formalized in this dissertation. Two examples are used to motivate and demonstrate the usefulness of ROBAC. Comparisons between ROBAC and other RBAC extensions are given. I show that ROBAC can significantly reduce the administrative complexity of applications involving a large number of similar organizational units. The applicability and expressive power of ROBAC are discussed. By showing that any given ROBAC model can be modeled by an RBAC model and vice versa, I prove that the expressive power of ROBAC is equal to that of traditional RBAC. A comprehensive role and organization based administrative model called AROBAC07 is developed. It has five sub-models dealing with various administrative tasks in ROBAC. I show that the AROBAC07 model provides an intuitive and controlled way to decentralize administrative tasks in ROBAC-based systems. A concept called application compartment (ACom) in ROBAC is introduced and its usage in ROBAC is discussed. AROBAC07 scales up very well for ROBAC-based systems involving many organizational units. Two ROBAC variants, manifold ROBAC and pseudo ROBAC, are presented and formalized, and their corresponding administrative models are also proposed. The usefulness of manifold ROBAC is demonstrated in secure collaboration via a ROBAC-based secure collaboration schema which avoids many problems resulting from role mapping, role translation, or role exporting. The usefulness of pseudo ROBAC is demonstrated in a web-based on-demand movie service case study.
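A minimal sketch of the core ROBAC idea: users hold (role, organization) pairs, so a role granted in one organizational unit confers nothing in another. The dissertation's models (administration, application compartments, and the manifold and pseudo variants) are considerably richer than this, and the roles and permissions below are invented.

```python
# Minimal ROBAC-style check: user assignments are (role, organization) pairs,
# permissions attach to roles, and access requires the role in that organization.
USER_ASSIGNMENTS = {                      # user -> set of (role, organization)
    "alice": {("doctor", "clinic_A")},
    "bob":   {("doctor", "clinic_B"), ("auditor", "clinic_A")},
}
ROLE_PERMISSIONS = {                      # role -> set of permissions
    "doctor":  {"read_record", "write_record"},
    "auditor": {"read_record"},
}

def check_access(user, organization, permission):
    return any(org == organization and permission in ROLE_PERMISSIONS.get(role, set())
               for role, org in USER_ASSIGNMENTS.get(user, set()))

print(check_access("alice", "clinic_A", "write_record"))   # True
print(check_access("alice", "clinic_B", "write_record"))   # False: a role does not span organizations
print(check_access("bob",   "clinic_A", "write_record"))   # False: auditor in clinic_A cannot write
```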