A fundamental paradigm shift is currently taking place in the field of computing: due to the miniaturization of computing devices and the proliferation of embedded systems, tiny, networked computers can now be easily integrated into everyday objects, turning them into smart things. In the resulting Internet of Things, physical items are no longer disconnected from the virtual world but become accessible through computers and other networked devices, and can even make use of protocols that are widely deployed in the World Wide Web, in a paradigm that we call the Web of Things. Eventually, smart things will be able to communicate, analyze, decide, and act, and thereby provide invisible background assistance that should make life more enjoyable, entertaining, and also safer. However, in an environment that is populated by hundreds of Web-enabled smart things, it will become increasingly difficult for humans to find, select, and control the devices that are relevant to their current needs.

The objective of this thesis is to investigate how human users could be enabled to conveniently interact with individual smart objects in their surroundings, and how they could interconnect devices and configure the resulting physical mashups to perform higher-level tasks on their behalf. To achieve basic interoperability between devices, we rely on the World Wide Web with its proven protocols and architectural patterns, which emphasize scalability, generic interfaces, and loose coupling between components.

As a first step to facilitate the interaction with smart things on top of the basic Web principles, we propose the embedding of metadata for automatically generating user interfaces for smart devices. Because this metadata describes the high-level interaction semantics of smart devices rather than purely interface-specific information, our approach enables not only the generation of more intuitive graphical widgets but also the mapping of interactive components to gesture-based, speech-based, and physical interfaces. Providing an interaction mechanism for a smart object is thus reduced to embedding simple interaction information into the representation of the smart thing. Before users can interact with a smart device, however, they must first select it. To permit users to choose which of the many smart objects in their surroundings should be involved in an interaction, we propose the use of optical image recognition. Together, the visual selection of smart things and automatically generated user interfaces enable end users to conveniently interact with individual services in their surroundings that are embodied in specific physical objects.

To complement the direct interaction with smart devices, the second part of this thesis focuses on more complex use cases where multiple smart objects must collaborate to achieve the user's goal. Such situations arise, for instance, in home or office automation scenarios, or in smart factories, where machines or assembly lines could adjust to better support the operator. To put users more in control of entire environments of smart devices, we present a system that records interactions between smart things and with remote services and displays this data to users in real time, using an augmented reality overlay on the camera feed of handheld or wearable devices such as smartphones and smartglasses.
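To make the idea of embedded interaction metadata more concrete, the following minimal sketch in Python (using the third-party requests library) shows how a client might fetch the Web representation of a hypothetical smart lamp and map an interface-agnostic "level" descriptor to a graphical slider. The device URL, the JSON layout, and the field names are illustrative assumptions, not the concrete format developed in this thesis.

```python
# Minimal sketch (illustrative, not the thesis implementation): fetch a smart
# thing's Web representation and map its embedded, interface-agnostic
# interaction metadata to a concrete widget.
import requests

THING_URL = "http://lamp.example.local/properties/brightness"  # hypothetical device

representation = requests.get(THING_URL, headers={"Accept": "application/json"}).json()

# Hypothetical embedded interaction descriptor, e.g.:
# {"value": 40, "interaction": {"type": "level", "min": 0, "max": 100, "label": "Brightness"}}
meta = representation.get("interaction", {})

# Map the high-level interaction semantics ("a level between min and max") to a
# rendering-specific widget; another renderer could map the same descriptor to
# a speech prompt or a rotation gesture instead of a slider.
if meta.get("type") == "level":
    print(f"Render slider '{meta['label']}' [{meta['min']}..{meta['max']}], "
          f"current value {representation['value']}")

def set_level(new_value: int) -> None:
    """Write the user's input back to the thing via the same generic Web interface."""
    requests.put(THING_URL, json={"value": new_value})
```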
Next, we propose a management infrastructure for smart things that makes the services they offer discoverable and composable, and fully integrates them with more traditional Web-based information providers. This system enables humans to find and use data and functionality provided by physical devices; it also allows machines to support users in finding services within densely populated smart environments, and even to discover and use required services themselves on the user's behalf. The basis for these applications is a generic mechanism that allows smart devices to provide semantic descriptions of the services they offer. Specifically, our infrastructure supports the embedding of functional semantic metadata into smart things that describes which functionality a concrete object provides and how to invoke it. Based on this metadata, a semantic reasoning component can determine which composite tasks can be achieved by a user's smart environment and can provide instructions on how to reach concrete goals, thus enabling end users to configure entire smart environments.

As a concrete use case, we present a platform that applies our proposed interaction modes with smart things to automobiles: a mobile application recognizes cars, downloads information about them from a back-end server, and displays this information, together with interaction capabilities for the car and its services, on the user's interface device. The back-end server furthermore exposes functional metadata about the capabilities of individual cars to make their services automatically usable within physical mashups. Finally, it records client interactions to enable car owners to monitor in real time who accesses which kinds of data and services on their vehicles.

The overarching objective of this thesis is to show how current technologies could support the interaction of end users with Web-enabled smart devices. To achieve this, we make use of a number of technologies from different areas of computer science: A management infrastructure makes smart things discoverable for human users and machines and builds upon current research in the distributed systems domain. State-of-the-art computer vision technologies allow users to select devices in their environment using handheld or wearable computers such as smartphones or smartglasses. Novel methods from the field of human-computer interaction enable the embedding of metadata that allows for automatically generating user interfaces. Finally, semantic technologies enable flexible compositions of smart things that collaborate to achieve the user's goal.
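In the same spirit as the sketch above, the short example below illustrates how a machine client might use the functional semantic metadata described earlier on the user's behalf: it fetches a hypothetical functional description from a Web-enabled car and invokes an advertised capability that matches a requested goal. The URLs, the description format, and the capability vocabulary are again illustrative assumptions, not the actual infrastructure presented in this thesis.

```python
# Minimal sketch (illustrative assumptions only): discover which capabilities a
# thing advertises in its functional description and invoke one on behalf of
# the user.
import requests

THING_URL = "http://car.example.local"  # hypothetical Web-enabled car

# Hypothetical functional description, e.g.:
# {"capabilities": [{"provides": "ex:UnlockDoors", "href": "/actions/unlock", "method": "POST"}]}
description = requests.get(f"{THING_URL}/description",
                           headers={"Accept": "application/json"}).json()

def invoke(goal: str) -> bool:
    """Invoke the first advertised capability that matches the requested goal."""
    for capability in description.get("capabilities", []):
        if capability.get("provides") == goal:
            requests.request(capability.get("method", "POST"), THING_URL + capability["href"])
            return True
    # A semantic reasoner could instead compose several things to reach the goal.
    return False

invoke("ex:UnlockDoors")
```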