
Distributed computing

Distributed computing is a field of computer science that studies distributed systems. A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs.

Distributed computing also refers to the use of distributed systems to solve computational problems. In distributed computing, a problem is divided into many tasks, each of which is solved by one or more computers, which communicate with each other by message passing.

Distributed computing Introduction

The word distributed in terms such as “distributed system”, “distributed programming”, and “distributed algorithm” originally referred to computer networks where individual computers were physically distributed within some geographical area. The following defining properties are commonly used:

• There are several autonomous computational entities, each of which has its own local memory.
• The entities communicate with each other by message passing.

In this article, the computational entities are called computers or nodes.

A distributed system may have a common goal, such as solving a large computational problem. Alternatively, each computer may have its own user with individual needs, and the purpose of the distributed system is to coordinate the use of shared resources or provide communication services to the users.

Other typical properties of distributed systems include the following:

• The system has to tolerate failures in individual computers.
• The structure of the system (network topology, network latency, number of computers) is not known in advance; the system may consist of different kinds of computers and network links, and the system may change during the execution of a distributed program.
• Each computer has only a limited, incomplete view of the system. Each computer may know only one part of the input.

Distributed computing Parallel and distributed computing

Parallel computing may be seen as a particularly tightly coupled form of distributed computing, and distributed computing may be seen as a loosely coupled form of parallel computing.

• In parallel computing, all processors may have access to a shared memory to exchange information between processors.
• In distributed computing, each processor has its own private memory (distributed memory). Information is exchanged by passing messages between the processors.

The figure on the right illustrates the difference between distributed and parallel systems.

The situation is further complicated by the traditional uses of the terms parallel and distributed algorithm that do not quite match the above definitions of parallel and distributed systems; see the section Theoretical foundations below for more detailed discussion. Nevertheless, as a rule of thumb, high-performance parallel computation in a shared-memory multiprocessor uses parallel algorithms, while the coordination of a large-scale distributed system uses distributed algorithms.
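The message-passing side of this distinction can be sketched in a few lines. In the sketch below, threads merely stand in for separate machines; the point is that the worker keeps its state local and shares information only through explicit messages on queues (all names here are illustrative):

```python
import threading, queue

# Minimal message-passing sketch: the worker has no shared variables with the
# main routine; the only information that crosses over travels as messages.

def worker(inbox, outbox):
    n = inbox.get()             # receive a message (blocking)
    outbox.put(n * n)           # reply with a message

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
inbox.put(7)                    # "send" to the worker
result = outbox.get()           # "receive" the reply: 49
t.join()
print(result)
```

In a shared-memory parallel system, by contrast, the worker could simply read and write a common variable instead of exchanging messages.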

Distributed computing History

The use of concurrent processes that communicate by message passing has its roots in operating system architectures studied in the 1960s. The first widespread distributed systems were local-area networks such as Ethernet, which was invented in the 1970s.

ARPANET, the predecessor of the Internet, was introduced in the late 1960s, and ARPANET e-mail was invented in the early 1970s. E-mail became the most successful application of ARPANET, and it is probably the earliest example of a large-scale distributed application. In addition to ARPANET and its successor, the Internet, other early worldwide computer networks included Usenet and FidoNet from the 1980s, both of which were used to support distributed discussion systems.

The study of distributed computing became its own branch of computer science in the late 1970s and early 1980s. The first conference in the field, the Symposium on Principles of Distributed Computing (PODC), dates back to 1982, and its European counterpart, the International Symposium on Distributed Computing (DISC), was first held in 1985.

Distributed computing Applications

Reasons for using distributed systems and distributed computing may include:

1. The very nature of an application may require the use of a communication network that connects several computers: for example, data produced in one physical location and required in another location.
2. There are many cases in which the use of a single computer would be possible in principle, but the use of a distributed system is beneficial for practical reasons.

Ghaemi et al. define a distributed query as a query “that selects data from databases located at multiple sites in a network” and offer as an SQL example:
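The cited SQL example is not reproduced here. As an illustration of the idea only (not Ghaemi et al.'s actual query), a query that joins data from two separate databases can be simulated with SQLite's ATTACH mechanism; every table, column, and value below is made up for this sketch:

```python
import sqlite3

# Two separate SQLite databases stand in for databases at two network sites.
conn = sqlite3.connect(":memory:")                    # "site A"
conn.execute("ATTACH DATABASE ':memory:' AS site_b")  # "site B", a second database

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE site_b.orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO site_b.orders VALUES (?, ?, ?)",
                 [(10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.0)])

# The "distributed" query: one SELECT joining tables that live in two databases.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c JOIN site_b.orders AS o ON o.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(rows)   # [('Alice', 40.0), ('Bob', 40.0)]
```

A real distributed query engine would additionally decide where each part of the plan executes and ship intermediate results between sites, which this single-process sketch glosses over.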

Distributed computing Examples

Examples of distributed systems and applications of distributed computing include the following:

• Telecommunication networks:
  ◦ Computer networks such as the Internet
  ◦ Massively multiplayer online games and virtual reality communities
  ◦ Distributed information processing systems such as banking systems and airline reservation systems
  ◦ Industrial control systems
• Parallel computation:
  ◦ Scientific computing, including cluster computing, grid computing, and various volunteer computing projects; see the list of distributed computing projects
  ◦ Distributed rendering in computer graphics

Distributed computing Models

Many tasks that we would like to automate by using a computer are of question–answer type: we would like to ask a question and the computer should produce an answer. In theoretical computer science, such tasks are called computational problems. Formally, a computational problem consists of instances together with a solution for each instance. Instances are questions that we can ask, and solutions are desired answers to these questions.

Theoretical computer science seeks to understand which computational problems can be solved by using a computer (computability theory) and how efficiently (computational complexity theory). The field of concurrent and distributed computing studies similar questions in the case of either multiple computers, or a computer that executes a network of interacting processes: which computational problems can be solved in such a network and how efficiently? However, it is not at all obvious what is meant by “solving a problem” in the case of a concurrent or distributed system: for example, what is the task of the algorithm designer, and what is the concurrent or distributed equivalent of a sequential general-purpose computer?

The discussion below focuses on the case of multiple computers, although many of the issues are the same for concurrent processes running on a single computer.

Three viewpoints are commonly used:

Parallel algorithms in shared-memory model

• All computers have access to a shared memory. The algorithm designer chooses the program executed by each computer.
• One commonly used theoretical model is the parallel random-access machine (PRAM). However, the classical PRAM model assumes synchronous access to the shared memory.
• A model that is closer to the behavior of real-world multiprocessor machines and takes into account the use of machine instructions such as compare-and-swap (CAS) is that of asynchronous shared memory. There is a wide body of work on this model, a summary of which can be found in the literature.

Parallel algorithms in message-passing model

• The algorithm designer chooses the structure of the network, as well as the program executed by each computer.
• Models such as Boolean circuits and sorting networks are used. A Boolean circuit can be seen as a computer network: each gate is a computer that runs an extremely simple computer program. Similarly, a sorting network can be seen as a computer network: each comparator is a computer.

Distributed algorithms in message-passing model

• The algorithm designer only chooses the computer program. All computers run the same program. The system must work correctly regardless of the structure of the network.
• A commonly used model is a graph with one finite-state machine per node.

In the case of distributed algorithms, computational problems are typically related to graphs. Often the graph that describes the structure of the computer network is the problem instance. This is illustrated in the following example.
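Algorithms for the asynchronous shared-memory model mentioned above are typically built from compare-and-swap retry loops. The sketch below emulates a CAS cell in plain Python, using a lock only as a stand-in for the hardware instruction's atomicity; the class and function names are illustrative:

```python
import threading

# Toy emulation of the compare-and-swap (CAS) primitive from the asynchronous
# shared-memory model. Real hardware provides CAS as one atomic instruction;
# the lock below merely stands in for that atomicity.

class CasCell:
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Atomically set the cell to `new` iff it currently equals `expected`."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def load(self):
        with self._lock:
            return self._value

def increment(cell):
    while True:                       # the classic CAS retry loop
        old = cell.load()
        if cell.compare_and_swap(old, old + 1):
            return                    # our update won; otherwise retry

counter = CasCell(0)
threads = [threading.Thread(target=increment, args=(counter,)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(counter.load())                 # 8: no increment is lost despite contention
```

The retry loop is the key idiom: a process reads, computes a new value, and commits only if no other process changed the cell in between, retrying otherwise.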

Distributed computing An example

Consider the computational problem of finding a coloring of a given graph G. Different fields might take the following approaches:

Centralized algorithms

• The graph G is encoded as a string, and the string is given as input to a computer. The computer program finds a coloring of the graph, encodes the coloring as a string, and outputs the result.

Parallel algorithms

• Again, the graph G is encoded as a string. However, multiple computers can access the same string in parallel. Each computer might focus on one part of the graph and produce a coloring for that part.
• The main focus is on high-performance computation that exploits the processing power of multiple computers in parallel.

Distributed algorithms

• The graph G is the structure of the computer network. There is one computer for each node of G and one communication link for each edge of G. Initially, each computer only knows about its immediate neighbors in the graph G; the computers must exchange messages with each other to discover more about the structure of G. Each computer must produce its own color as output.
• The main focus is on coordinating the operation of an arbitrary distributed system.

While the field of parallel algorithms has a different focus than the field of distributed algorithms, there is a lot of interaction between the two fields. For example, the Cole–Vishkin algorithm for graph coloring was originally presented as a parallel algorithm, but the same technique can also be used directly as a distributed algorithm.

Moreover, a parallel algorithm can be implemented either in a parallel system (using shared memory) or in a distributed system (using message passing). The traditional boundary between parallel and distributed algorithms (choose a suitable network vs. run in any given network) does not lie in the same place as the boundary between parallel and distributed systems (shared memory vs. message passing).
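The distributed viewpoint on coloring can be made concrete with a toy round-based simulation: in each synchronous round, every still-uncolored node looks only at its neighbors, and a node whose ID is locally maximal among uncolored neighbors picks the smallest color its colored neighbors have not taken. This scheduling rule and the 5-cycle below are illustrative assumptions, not the Cole–Vishkin algorithm:

```python
# Toy synchronous simulation of distributed greedy coloring. Each round, a node
# decides only if its ID beats all still-uncolored neighbors, so concurrent
# decisions never conflict; each decision uses only the node's local view.

def distributed_greedy_coloring(adj):
    color = {}                      # node -> chosen color
    rounds = 0
    while len(color) < len(adj):
        rounds += 1
        # Nodes that may decide this round (locally maximal among uncolored).
        choosers = [v for v in adj if v not in color
                    and all(u in color or u < v for u in adj[v])]
        for v in choosers:          # independent decisions: safe "in parallel"
            taken = {color[u] for u in adj[v] if u in color}
            color[v] = min(c for c in range(len(adj)) if c not in taken)
    return color, rounds

# A 5-cycle: 0-1-2-3-4-0.
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
color, rounds = distributed_greedy_coloring(adj)
print(color, rounds)
```

Note the cost of locality: this rule can take a number of rounds proportional to the longest descending ID path, which is exactly the kind of inefficiency that symmetry-breaking algorithms such as Cole–Vishkin are designed to avoid.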

Distributed computing Complexity measures

In parallel algorithms, yet another resource in addition to time and space is the number of computers.

Perhaps the simplest model of distributed computing is a synchronous system where all nodes operate in a lockstep fashion. In such a system, a natural complexity measure is the number of synchronous communication rounds required to complete the task.

This complexity measure is closely related to the diameter of the network. Let D be the diameter of the network. On the one hand, any computable problem can be solved trivially in a synchronous distributed system in approximately 2D communication rounds: simply gather all information in one location (D rounds), solve the problem, and inform each node about the solution (D rounds).

On the other hand, if the running time of the algorithm is much smaller than D communication rounds, then the nodes in the network must produce their output without having the possibility of obtaining information about distant parts of the network; they must make globally consistent decisions based on information available in their local neighbourhood.

Other commonly used measures include the total number of bits transmitted in the network (cf. communication complexity).
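The 2D bound rests on the fact that information travels one hop per round. A toy flooding simulation makes this concrete: if every node forwards everything it knows each round, then after r rounds a node knows exactly its radius-r neighbourhood, so gathering all input takes D rounds. The path graph below is an illustrative choice:

```python
# Toy lockstep simulation: each round, every node sends its entire knowledge
# to its neighbors. Full information gathering takes exactly D rounds, where
# D is the diameter of the network.

def rounds_until_everyone_knows_everything(adj):
    knowledge = {v: {v} for v in adj}   # initially, each node knows only itself
    rounds = 0
    while any(known != set(adj) for known in knowledge.values()):
        rounds += 1
        sent = {v: set(knowledge[v]) for v in adj}   # snapshot: lockstep rounds
        for v in adj:
            for u in adj[v]:
                knowledge[v] |= sent[u]              # receive from each neighbor
    return rounds

# Path 0-1-2-3-4 has diameter D = 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(rounds_until_everyone_knows_everything(path))  # 4
```

This also illustrates the converse point in the text: an algorithm that stops well before D rounds has, provably, never seen the distant parts of the input.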

Distributed computing Other problems

Traditional computational problems take the perspective that we ask a question, a computer (or a distributed system) processes the question for a while, and then produces an answer and stops.

There are also fundamental challenges that are unique to distributed computing. The first example is challenges that are related to fault tolerance. Examples of related problems include consensus problems, Byzantine fault tolerance, and self-stabilisation.

A lot of research is also focused on understanding the asynchronous nature of distributed systems:

• Synchronizers can be used to run synchronous algorithms in asynchronous systems.
• Logical clocks provide a causal happened-before ordering of events.
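The simplest logical clock is Lamport's: each process increments a counter on local events and sends, and on receipt jumps past the sender's timestamp, so causally later events always carry larger timestamps. A minimal sketch (class and method names are illustrative):

```python
# Minimal Lamport logical clock: timestamps respect the happened-before order.

class Process:
    def __init__(self, name):
        self.name, self.clock = name, 0

    def local_event(self):
        self.clock += 1
        return self.clock

    def send(self):
        self.clock += 1
        return self.clock             # this timestamp travels with the message

    def receive(self, msg_time):
        self.clock = max(self.clock, msg_time) + 1   # jump past the sender
        return self.clock

p, q = Process("P"), Process("Q")
p.local_event()                        # P: 1
t = p.send()                           # P: 2, message carries t = 2
q.local_event()                        # Q: 1
q.receive(t)                           # Q: max(1, 2) + 1 = 3
print(p.clock, q.clock)                # 2 3
```

Note the guarantee is one-directional: if event a happened before event b, then a's timestamp is smaller, but unrelated events may still receive ordered timestamps; vector clocks strengthen this at the cost of larger timestamps.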

Distributed computing Properties of distributed systems

So far the focus has been on designing a distributed system that solves a given problem. A complementary research problem is studying the properties of a given distributed system.

The halting problem is an analogous example from the field of centralised computation: we are given a computer program and the task is to decide whether it halts or runs forever. The halting problem is undecidable in the general case, and naturally understanding the behaviour of a computer network is at least as hard as understanding the behaviour of one computer.

However, there are many interesting special cases that are decidable.

Distributed computing Coordinator Election

In order to perform coordination, distributed systems employ the concept of coordinators. The coordinator election problem is to choose a process from among a group of processes on different processors in a distributed system to act as the central coordinator. Several central coordinator election algorithms exist.

Distributed computing Bully algorithm

When using the Bully algorithm, any process sends a message to the current coordinator. If there is no response within a given time limit, the process tries to elect itself as leader.
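The election step can be sketched as a single-threaded toy: the process that detects a dead coordinator asks all live higher-ID processes to take over, and if none exists it wins. Real Bully implementations use timeouts and asynchronous election/OK/coordinator messages; the `alive` table below is a stand-in for those, and all names are illustrative:

```python
# Toy sketch of the Bully election idea: higher-ID live processes "bully"
# lower ones out of the election; the highest live ID becomes coordinator.

def bully_election(process_ids, alive, initiator):
    """Return the coordinator's ID for an election started by `initiator`."""
    higher = [p for p in process_ids if p > initiator and alive[p]]
    if not higher:
        return initiator          # nobody higher answered: initiator wins
    # Some live higher process takes over and runs its own election.
    return bully_election(process_ids, alive, min(higher))

ids = [1, 2, 3, 4, 5]
alive = {1: True, 2: True, 3: True, 4: True, 5: False}   # old coordinator 5 died
print(bully_election(ids, alive, 2))   # 4: the highest live ID wins
```

The characteristic property shown here is that the outcome is always the highest-ID live process, regardless of which process detects the failure and starts the election.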

Distributed computing Chang and Roberts algorithm

The Chang and Roberts algorithm (or “Ring Algorithm”) is a ring-based election algorithm used to find the process with the largest unique identification number.
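The core idea can be simulated synchronously: each node sends its UID around the ring; a node forwards an incoming UID larger than its own and swallows smaller ones; the node that receives its own UID back knows it has the largest and becomes leader. Array positions stand in for ring nodes here, and real implementations are asynchronous:

```python
# Toy synchronous simulation of the Chang-Roberts ring election.
# uids[i] is the unique ID of the node at ring position i; messages move
# one hop per round from position i to position (i + 1) % n.

def chang_roberts(uids):
    n = len(uids)
    in_transit = list(uids)            # round 0: every node sends its own UID
    while True:
        nxt = [None] * n
        for i in range(n):
            msg = in_transit[i]
            if msg is None:
                continue
            j = (i + 1) % n            # ring successor receives the message
            if msg == uids[j]:
                return msg             # UID survived a full loop: j is leader
            if msg > uids[j]:
                nxt[j] = msg           # forward the larger UID
            # smaller UIDs are swallowed and vanish
        in_transit = nxt
        if all(m is None for m in in_transit):
            raise RuntimeError("no election in progress")

print(chang_roberts([3, 7, 2, 9, 4]))  # 9: the largest UID wins
```

Only the maximal UID can travel all the way around, so the algorithm elects exactly one leader with at most O(n²) messages in the worst case.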

Distributed computing Architectures

Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of loosely coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.

Distributed programming typically falls into one of several basic architectures or categories: client–server, 3-tier architecture, n-tier architecture, distributed objects, loose coupling, or tight coupling.

• Client–server: smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
• 3-tier architecture: three-tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-tier.
• n-tier architecture: n-tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
• Highly coupled (clustered): refers typically to a cluster of machines that closely work together, running a shared process in parallel. The task is subdivided into parts that are worked on individually by each machine and then put back together to form the final result.
• Peer-to-peer: an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead, all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and as servers.
• Space-based: refers to an infrastructure that creates the illusion (virtualization) of a single address space. Data are transparently replicated according to application needs. Decoupling in time, space, and reference is achieved.

Another basic aspect of distributed computing architecture is the method of communicating and coordinating work among concurrent processes. Through various message-passing protocols, processes may communicate directly with one another, typically in a master/slave relationship. Alternatively, a “database-centric” architecture can enable distributed computing to be done without any form of direct inter-process communication, by utilizing a shared database.

Distributed computing Further reading

• Coulouris, George, et al. (2011), Distributed Systems: Concepts and Design (5th Edition), Addison-Wesley. ISBN 0-132-14301-1.
• Attiya, Hagit and Welch, Jennifer (2004), Distributed Computing: Fundamentals, Simulations, and Advanced Topics, Wiley-Interscience. ISBN 0-471-45324-2.
• Faber, Jim (1998), Java Distributed Computing, O’Reilly.
• Garg, Vijay K. (2002), Elements of Distributed Computing, Wiley-IEEE Press. ISBN 0-471-03600-5.
• Tel, Gerard (1994), Introduction to Distributed Algorithms, Cambridge University Press.
• Keidar, Idit; Rajsbaum, Sergio, eds. (2000–2009), “Distributed computing column”, ACM SIGACT News.
• Birrell, A. D.; Levin, R.; Schroeder, M. D.; Needham, R. M. (April 1982). “Grapevine: An exercise in distributed computing”. Communications of the ACM 25 (4): 260–274. doi:10.1145/358468.358487.
• Rodríguez, C.; Villagra, M.; Barán, B. (2007), Asynchronous team algorithms for Boolean Satisfiability, Bionetics 2007, pp. 66–69.

Contents – Distributed computing

The first record was achieved on September 16, 2007, as the project surpassed one petaFLOPS, which had never previously been attained by a distributed computing network.

Parallel computing – Distributed computing

A distributed computer (also known as a distributed memory multiprocessor) is a distributed memory computer system in which the processing elements are connected by a network. Distributed computers are highly scalable.

Object (computer science) – In distributed computing

The definition of an object as an entity that has a distinct identity, state, and behavior, and, it is claimed, the principle of encapsulation, can be carried over to the realm of distributed computing. Paradoxically, encapsulation does not extend to an object’s behavior, since neither the methods nor even their names are serialized along with the data. A number of extensions to the basic concept of an object have been proposed that share these common characteristics:

• Distributed objects are “ordinary” objects (objects in the usual sense, i.e. not OOP objects) that have been set up at a number of distinct remote locations and communicate by exchanging messages over the network. Examples include web services and DCOM objects.
• Protocol objects are components of a protocol stack that enclose network communication within an object-oriented interface.
• Replicated objects are groups of distributed objects (called replicas) that run a distributed multi-party protocol to achieve high consistency between their internal states, and that respond to requests in a coordinated way. Referring to the group of replicas jointly as an object reflects the fact that interacting with any of them exposes the same externally visible state and behavior. Examples include fault-tolerant CORBA objects.
• Live distributed objects (or simply live objects) generalize the replicated object concept to groups of replicas that might internally use any distributed protocol, perhaps resulting in only a weak consistency between their local states.

Some of these extensions, such as distributed objects and protocol objects, are domain-specific terms for special types of “ordinary” objects used in a certain context (such as remote invocation or protocol composition).

International Symposium on Distributed Computing – History

DISC dates back to 1985, when it began as a biennial Workshop on Distributed Algorithms on Graphs (WDAG); it became annual in 1989. The name changed to the present one in 1998.

While the first WDAG was held in Ottawa, Canada in 1985, WDAG/DISC has since been organised primarily in European locations, one exception being WDAG 1992 in Haifa, Israel. In September 2010, DISC returned to North America for the first time since 1985: the 24th DISC took place in Cambridge, Massachusetts, USA. In the same year, its North American sister conference PODC was held in Europe (Zürich) for the first time in its history.

Symposium on Principles of Distributed Computing – Scope and related conferences

Work presented at PODC typically studies theoretical aspects of distributed computing, such as the design and analysis of distributed algorithms. The scope of PODC is similar to that of the International Symposium on Distributed Computing (DISC), with the main difference being geographical: DISC is usually organised in European locations, while PODC has traditionally been held in North America. The Edsger W. Dijkstra Prize in Distributed Computing is presented alternately at PODC and at DISC.

Other closely related conferences include the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), which, as the name suggests, puts more emphasis on parallel algorithms than on distributed algorithms. PODC and SPAA have been co-located in 1998, 2005, and 2009.

Symposium on Principles of Distributed Computing – Reputation and selectivity

PODC is often mentioned as one of the top conferences in the field of distributed computing. In the 2007 Australian Ranking of ICT Conferences, PODC was the only conference in the field that received the highest ranking, “A+”.

During the years 2004–2009, the number of regular papers submitted to PODC fluctuated between 110 and 224 each year. Of these submissions, 27–40 papers were accepted for presentation at the conference each year; acceptance rates for regular papers have been between 16% and 31%.

Symposium on Principles of Distributed Computing – History

PODC was first organised on 18–20 August 1982, in Ottawa, Canada. PODC was part of the Federated Computing Research Conference in 1996, 1999, and 2011.

Between 1982 and 2009, PODC was always held in a North American location, usually in the USA or Canada, and once in Mexico. In 2010, PODC was held in Europe for the first time in its history, and in the same year its European sister conference DISC was organised in the USA for the first time in its history. PODC 2010 took place in Zürich, Switzerland, and DISC 2010 took place in Cambridge, Massachusetts.

Since 2000, a review of the PODC conference has appeared in the year-ending issue of the ACM SIGACT News Distributed Computing Column. The review is usually written by the winner or winners of the Best Student Paper Award at the conference.

Distributed computing – Introduction

• There are several autonomous computational entities, each of which has its own local memory.

Distributed computing – Introduction

• The entities communicate with each other by message passing.

Distributed computing – Introduction

• The system has to tolerate failures in individual computers.

Distributed computing – Introduction

• The structure of the system (network topology, network latency, number of computers) is not known in advance, the system may consist of different kinds of computers and network links, and the system may change during the execution of a distributed program.

Distributed computing – Introduction

• Each computer has only a limited, incomplete view of the system. Each computer may know only one part of the input.

Distributed computing – Parallel and distributed computing

• In parallel computing, all processors may have access to a shared memory to exchange information between processors.

Distributed computing – Parallel and distributed computing

• In distributed computing, each processor has its own private memory (distributed memory). Information is exchanged by passing messages between the processors.

Distributed computing – Applications

1. The very nature of an application may require the use of a communication network that connects several computers: for example, data produced in one physical location and required in another location.

Distributed computing – Applications

2. There are many cases in which the use of a single computer would be possible in principle, but the use of a distributed system is beneficial for practical reasons

Distributed computing – Examples

• Telecommunication networks:

Distributed computing – Examples

? Computer networks such as the Internet

Distributed computing – Examples

? Massively multiplayer online games and Virtual Reality communities

Distributed computing – Examples

? Distributed information processing systems such as banking systems and airline reservation systems

Distributed computing – Examples

? Industrial control systems

Distributed computing – Examples

• Parallel computation:

Distributed computing – Examples

? Scientific computing, including cluster computing and grid computing and various volunteer computing projects; see the list of distributed computing projects

Distributed computing – Examples

? Distributed rendering in computer graphics

Distributed computing – Models

• All computers have access to a shared memory. The algorithm designer chooses the program executed by each computer.

Distributed computing – Models

• One theoretical model is the parallel random access machines (PRAM) that are used. However, the classical PRAM model assumes synchronous access to the shared memory.

Distributed computing – Models

• A model that is closer to the behavior of real-world multiprocessor machines and takes into account the use of machine instructions, such as Compare-and-swap (CAS), is that of asynchronous shared memory. There is a wide body of work on this model, a summary of which can be found in the literature.

Distributed computing – Models

• The algorithm designer chooses the structure of the network, as well as the program executed by each computer.

Distributed computing – Models

• Models such as Boolean circuits and sorting networks are used. A Boolean circuit can be seen as a computer network: each gate is a computer that runs an extremely simple computer program. Similarly, a sorting network can be seen as a computer network: each comparator is a computer.

Distributed computing – Models

• The algorithm designer only chooses the computer program. All computers run the same program. The system must work correctly regardless of the structure of the network.

Distributed computing – Models

• A commonly used model is a graph with one finite-state machine per node.

Distributed computing – An example

• The graph G is encoded as a string, and the string is given as input to a computer. The computer program finds a coloring of the graph, encodes the coloring as a string, and outputs the result.

Distributed computing – An example

• Again, the graph G is encoded as a string. However, multiple computers can access the same string in parallel. Each computer might focus on one part of the graph and produce a coloring for that part.

Distributed computing – An example

• The main focus is on high-performance computation that exploits the processing power of multiple computers in parallel.

Distributed computing – An example

• The graph G is the structure of the computer network. There is one computer for each node of G and one communication link for each edge of G. Initially, each computer only knows about its immediate neighbors in the graph G; the computers must exchange messages with each other to discover more about the structure of G. Each computer must produce its own color as output.

Distributed computing – An example

• The main focus is on coordinating the operation of an arbitrary distributed system.

Distributed computing – Other problems

• Synchronizers can be used to run synchronous algorithms in asynchronous systems.

Distributed computing – Other problems

• Logical clocks provide a causal happened-before ordering of events.

Distributed computing – Architectures

• Client–server: Smart client code contacts the server for data then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.

Distributed computing – Architectures

• 3-tier architecture: Three tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-Tier.

Distributed computing – Architectures

• n-tier architecture: n-tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.

Distributed computing – Architectures

• highly coupled (clustered): refers typically to a cluster of machines that closely work together, running a shared process in parallel. The task is subdivided in parts that are made individually by each one and then put back together to make the final result.

Distributed computing – Architectures

• Peer-to-peer: an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and servers.

Distributed computing – Architectures

• Space based: refers to an infrastructure that creates the illusion (virtualization) of one single address-space. Data are transparently replicated according to application needs. Decoupling in time, space and reference is achieved.

Distributed computing – Further reading

• Coulouris, George, et al. (2011), Distributed Systems: Concepts and Design (5th ed.), Addison-Wesley, ISBN 0-132-14301-1.

Distributed computing – Further reading

• Attiya, Hagit; Welch, Jennifer (2004), Distributed Computing: Fundamentals, Simulations, and Advanced Topics, Wiley-Interscience, ISBN 0-471-45324-2.

Distributed computing – Further reading

• Faber, Jim (1998), Java Distributed Computing, O’Reilly.

Distributed computing – Further reading

• Garg, Vijay K. (2002), Elements of Distributed Computing, Wiley-IEEE Press, ISBN 0-471-03600-5.

Distributed computing – Further reading

• Tel, Gerard (1994), Introduction to Distributed Algorithms, Cambridge University Press.

Distributed computing – Further reading

• Keidar, Idit; Rajsbaum, Sergio, eds. (2000–2009), “Distributed computing column”, ACM SIGACT News.

Distributed computing – Further reading

• Birrell, A. D.; Levin, R.; Schroeder, M. D.; Needham, R. M. (April 1982). “Grapevine: An exercise in distributed computing”. Communications of the ACM 25 (4): 260–274. doi:10.1145/358468.358487.

Distributed computing – Further reading

• Rodríguez, C.; Villagra, M.; Barán, B. (2007), “Asynchronous team algorithms for Boolean Satisfiability”, Bionetics 2007, pp. 66–69.

Distributed Computing Environment

The ‘Distributed Computing Environment’ (DCE) is a software system developed in the early 1990s by a consortium that included Apollo Computer (later part of Hewlett-Packard), IBM, Digital Equipment Corporation, and others.

Distributed Computing Environment

DCE was a major step toward standardising architectures that had previously been manufacturer-dependent. The effort to implement the concept as software for different platforms was abandoned after a short period. Like the OSI model, DCE itself was not a commercial success, but its underlying concepts prevailed.

Distributed Computing Environment – History

The Distributed Computing Environment is a component of the OSF offerings, along with Motif and Distributed Management Environment (DME).

Distributed Computing Environment – History

By integrating security, RPC and other distributed services on a single official distributed computing environment, OSF could offer a major advantage over SVR4, allowing any DCE-supporting system (namely OSF/1) to interoperate in a larger network.

Distributed Computing Environment – History

The DCE system was, to a large degree, based on independent developments made by each of the partners.

Distributed Computing Environment – History

The rise of the Internet, Java and web services stole much of DCE’s mindshare through the mid-to-late 1990s, and competing systems such as CORBA muddied the waters as well.

Distributed Computing Environment – History

One of the major uses of DCE today is in Microsoft’s DCOM and ODBC systems, which use DCE/RPC (in MSRPC) as their network transport layer.

Distributed Computing Environment – History

OSF and its projects eventually became part of The Open Group, which released DCE 1.2.2 under a free software license (the GNU Lesser General Public License, LGPL) on 12 January 2005. DCE 1.1 was available much earlier under the OSF BSD license, and resulted in FreeDCE being available since 2000. FreeDCE contains an implementation of DCOM.

Distributed Computing Environment – Architecture

The largest unit of management in DCE is a ‘cell’.

Distributed Computing Environment – Architecture

• The ‘Security Server’ that is responsible for authentication

Distributed Computing Environment – Architecture

• The ‘Cell Directory Server’ (CDS) that is the repository of resources and ACLs

Distributed Computing Environment – Architecture

• The ‘Distributed Time Server’ that provides an accurate clock for the proper functioning of the entire cell

Distributed Computing Environment – Architecture

Modern DCE implementations such as IBM’s are fully capable of interoperating with Kerberos as the security server, LDAP for the CDS and the Network Time Protocol implementations for the time server.

Distributed Computing Environment – Architecture

While it is possible to implement a distributed file system using the DCE underpinnings by adding filenames to the CDS and defining the appropriate ACLs on them, this is not user-friendly.

Distributed Computing Environment – Architecture

DCE/DFS is believed to be the world’s only distributed filesystem that correctly implements the full POSIX filesystem semantics, including byte range locking. DCE/DFS was sufficiently reliable and stable to be utilised by IBM to run the back-end filesystem for the 1996 Olympics web site, seamlessly and automatically distributed and edited worldwide in different timezones.

Distributed application – Parallel and distributed computing

Distributed computing may be seen as a loosely coupled form of parallel computing.

Distributed application – Parallel and distributed computing

• In parallel computing, all processors may have access to a shared memory to exchange information between processors.

Distributed application – Parallel and distributed computing

• In distributed computing, each processor has its own private memory (distributed memory). Information is exchanged by passing messages between the processors.
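The private-memory, message-passing model can be sketched with operating-system processes connected by a pipe; the worker function and the values passed are illustrative:

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # No shared memory: the worker learns its input only via a message.
    n = conn.recv()
    conn.send(n * n)
    conn.close()

def square_via_message(n):
    """Send n to a separate process and receive n*n back by message passing."""
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(n)           # message out...
    result = parent.recv()   # ...result comes back as a message, not shared state
    p.join()
    return result

if __name__ == "__main__":
    assert square_via_message(6) == 36
```

Each process has its own address space, so the only way information crosses the boundary is through the explicit `send`/`recv` calls, which is exactly the contrast with the shared-memory model above.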

Distributed application – Parallel and distributed computing

The situation is further complicated by the traditional uses of the terms parallel algorithm and distributed algorithm, which do not quite match the above definitions of parallel and distributed systems; see the section on theoretical foundations for a more detailed discussion.

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

Synchronising concurrent processes. Achieving consensus in a distributed system in the presence of faulty nodes, or in a wait-free manner. Mutual exclusion in concurrent systems.

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

‘Dijkstra: “Solution of a problem in concurrent programming control”’

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

:This paper presented the first solution to the mutual exclusion problem. Leslie Lamport writes that this work “started the field of concurrent and distributed algorithms”. The paper did not receive the PODC Award or the Dijkstra Prize but was nevertheless mentioned twice in the descriptions of the winning papers, in 2002 and in 2006.

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

‘Pease, Shostak, Lamport: “Reaching agreement in the presence of faults”’

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

:These two papers introduced and studied the problem that is nowadays known as Byzantine fault tolerance. The 1980 paper presented the classical lower bound that agreement is impossible if at least 1/3 of the nodes are faulty; it received the Edsger W. Dijkstra Prize in Distributed Computing in 2005. The highly cited 1982 paper gave the problem its present name, and also presented algorithms for solving the problem.

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

‘Herlihy, Shavit: “The topological structure of asynchronous computation”’

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

‘Saks, Zaharoglou: “Wait-free k-set agreement is impossible …”’

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

:Gödel Prize lecture: http://www.cs.brown.edu/~mph/finland.pdf

List of important publications in concurrent, parallel, and distributed computing – Consensus, synchronisation, and mutual exclusion

:These two papers study wait-free algorithms for generalisations of the consensus problem, and showed that these problems can be analysed using topological properties and arguments. Both papers received the Gödel Prize in 2004.

List of important publications in concurrent, parallel, and distributed computing – Foundations of distributed systems

Fundamental concepts such as time and knowledge in distributed systems.

List of important publications in concurrent, parallel, and distributed computing – Foundations of distributed systems

‘Halpern, Moses: “Knowledge and common knowledge in a distributed environment”’

List of important publications in concurrent, parallel, and distributed computing – Foundations of distributed systems

:This paper formalised the notion of “knowledge” in distributed systems, demonstrated the importance of the concept of “common knowledge” in distributed systems, and also proved that common knowledge cannot be achieved if communication is not guaranteed. The paper received the Gödel Prize in 1997 and the Edsger W. Dijkstra Prize in Distributed Computing in 2009.

[email protected] – Alternative distributed computing projects

When the project was launched there were few alternative ways of donating computer time to research projects. However, there are now many other projects competing for such time.

Cancer research – Distributed computing

One can share computer time for distributed cancer research projects like Help Conquer Cancer. World Community Grid also had a project called Help Defeat Cancer. Other related projects include Folding@home and Rosetta@home, which focus on groundbreaking protein folding and protein structure prediction research.

Molecular modeling on GPU – Distributed computing projects

• GPUGRID (www.gpugrid.net) – distributed supercomputing infrastructure

Network architecture – Distributed computing

In distinct usage in distributed computing, the term network architecture often describes the structure and classification of a distributed application architecture, as the participating nodes in a distributed application are often referred to as a network.

Network architecture – Distributed computing

A popular example of such usage of the term in distributed applications, as well as PVCs (permanent virtual circuits), is the organization of nodes in peer-to-peer (P2P) services and networks. P2P networks usually implement overlay networks running over an underlying physical or logical network. These overlay networks may organize the nodes according to several distinct models, the network architecture of the system.

Network architecture – Distributed computing

Network architecture is a broad plan that specifies everything necessary for two application programs on different networks of an internetwork to be able to work together effectively.

List of important publications in computer science – Concurrent, parallel, and distributed computing

Topics covered: concurrent computing, parallel computing, and distributed computing.

Quorum (distributed computing)

A ‘quorum’ is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A ‘quorum’-based technique is implemented to enforce consistent operation in a distributed system.

Quorum (distributed computing) – Quorum-based techniques in distributed database systems

Quorum-based voting can be used as a replica control method

Quorum (distributed computing) – Quorum-based techniques in distributed database systems

It can also be used as a commit method to ensure transaction atomicity in the presence of network partitioning.

Quorum (distributed computing) – Quorum-based voting in commit protocols

In a distributed database system, a transaction could be executing its operations at multiple sites.

Quorum (distributed computing) – Quorum-based voting in commit protocols

Every site in the system is assigned a vote Vi. Let us assume that the total number of votes in the system is V and the abort and commit quorums are Va and Vc, respectively. Then the following rules must be obeyed in the implementation of the commit protocol:

Quorum (distributed computing) – Quorum-based voting in commit protocols

• Before a transaction commits, it must obtain a commit quorum Vc: the sum of the votes of at least one site that is prepared to commit and zero or more sites in the waiting state must be ≥ Vc.

Quorum (distributed computing) – Quorum-based voting in commit protocols

• Before a transaction aborts, it must obtain an abort quorum Va: the sum of the votes of zero or more sites that are prepared to abort and any sites in the waiting state must be ≥ Va.

Quorum (distributed computing) – Quorum-based voting in commit protocols

The first rule, Va + Vc > V, ensures that a transaction cannot be committed and aborted at the same time. The next two rules indicate the votes that a transaction has to obtain before it can terminate one way or the other.
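The overlap that prevents simultaneous commit and abort is commonly expressed as Va + Vc > V: any commit quorum and any abort quorum must then share at least one site. A minimal sketch, with made-up vote totals:

```python
def quorums_safe(V, Va, Vc):
    """A commit quorum of size Vc and an abort quorum of size Va can never
    both be assembled from V total votes if Va + Vc > V: the two quorums
    would have to overlap in at least one site, which votes only one way."""
    return Va + Vc > V and 0 < Va <= V and 0 < Vc <= V

# With 7 total votes, Vc = 4 and Va = 4 always overlap, so no two
# partitions can commit and abort the same transaction.
assert quorums_safe(7, 4, 4)
assert not quorums_safe(7, 3, 4)   # 3 + 4 = 7: disjoint quorums are possible
```

Raising Vc relative to Va makes committing harder and aborting easier, which is the usual choice when aborts are the safe default during partitions.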

Quorum (distributed computing) – Quorum-based voting for replica control

In replicated databases, a data object has copies present at several sites. To ensure serializability, no two transactions should be allowed to read or write a data item concurrently. In case of replicated databases, a quorum-based replica control protocol can be used to ensure that no two copies of a data item are read or written by two transactions concurrently.

Quorum (distributed computing) – Quorum-based voting for replica control

The quorum-based voting for replica control is due to [Gifford, 1979].

Quorum (distributed computing) – Quorum-based voting for replica control

Each copy of a replicated data item is assigned a vote. Each operation then has to obtain a read quorum (Vr) or a write quorum (Vw) to read or write a data item, respectively. If a given data item has a total of V votes, the quorums have to obey the following rules: Vr + Vw > V, and Vw > V/2.
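Gifford's quorum rules require Vr + Vw > V (so every read quorum intersects every write quorum and sees the latest write) and 2Vw > V (so two write quorums intersect and writes are totally ordered). A minimal check, with illustrative vote counts:

```python
def valid_quorums(V, Vr, Vw):
    """Gifford's rules for replica control:
    Vr + Vw > V  -> read and write quorums intersect (no stale reads),
    2 * Vw > V   -> two write quorums intersect (no conflicting writes)."""
    return Vr + Vw > V and 2 * Vw > V

# With 5 copies, Vr = 3 and Vw = 3 satisfy both rules (majority quorums).
assert valid_quorums(5, 3, 3)
# Read-one/write-all is the extreme case: Vr = 1 forces Vw = 5.
assert valid_quorums(5, 1, 5)
assert not valid_quorums(5, 2, 3)  # 2 + 3 = 5: a read can miss the last write
```

The rules expose the read/write trade-off directly: lowering Vr makes reads cheaper but forces a larger Vw, and vice versa.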

Performance tuning – Distributed computing

As the potential for parallel execution on modern CPU architectures continues to increase, the use of distributed systems is essential to achieving performance benefits from the available parallelism. High-performance cluster computing is a well-known use of distributed systems for performance improvements.

Performance tuning – Distributed computing

Distributed computing and clustering can negatively impact latency while simultaneously increasing load on shared resources, such as database systems. To minimize latency and avoid bottlenecks, distributed computing can benefit significantly from distributed caches.

Cell microprocessor – Distributed computing

The first record was achieved on September 16, 2007, as the project surpassed one petaFLOPS, which had never previously been attained by a distributed computing network.

[email protected] – Comparison to similar distributed computing projects

There are several distributed computing projects which have study areas similar to those of [email protected], but differ in their research approach:

Distributed programming – Parallel and distributed computing

Distributed computing may be seen as a loosely coupled form of parallel computing.

List of distributed computing projects

This is a list of distributed computing and grid computing projects. For each project, donors volunteer computing time from personal computers to a specific cause. This donated computing power comes typically from CPUs and GPUs (AMD or Nvidia). Folding@home once also harnessed the power of PlayStation 3s. Each project seeks to solve a problem which is difficult or infeasible to tackle using other methods.

List of distributed computing projects – Grid computing projects

While distributed computing works by dividing a complex problem among diverse and independent computer systems and then combining the results, grid computing works by utilizing a network of large pools of high-powered computing resources.
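The divide-then-combine pattern described here can be sketched with a process pool standing in for the independent computer systems; the sum-of-squares task and the chunking are illustrative:

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    # Each independent worker solves one part of the problem.
    return sum(x * x for x in chunk)

def distributed_sum_of_squares(data, n_parts=4):
    """Divide data into parts, compute each part in a separate process,
    then combine the partial results into the final answer."""
    step = max(1, len(data) // n_parts)
    chunks = [data[i:i + step] for i in range(0, len(data), step)]  # divide
    with ProcessPoolExecutor() as pool:
        return sum(pool.map(partial_sum, chunks))                   # combine

if __name__ == "__main__":
    data = list(range(1000))
    assert distributed_sum_of_squares(data) == sum(x * x for x in data)
```

In a real distributed or grid setting the workers would be separate machines reached over a network rather than local processes, but the divide/compute/combine structure is the same.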

List of distributed computing projects – Grid computing infrastructure

• BREIN uses the Semantic Web and multi-agent systems to build simple and reliable grid systems for business, with a focus on engineering and logistics management.

List of distributed computing projects – Grid computing infrastructure

• A-Ware is developing a stable, supported, commercially exploitable, high quality technology to give easy access to grid resources (www.a-ware-project.eu).

List of distributed computing projects – Grid computing infrastructure

• AssessGrid addresses obstacles to wide adoption of grid technologies by bringing risk management and assessment to this field, enabling use of grid computing in business and society (www.assessgrid.eu).

List of distributed computing projects – Grid computing infrastructure

• Cohesion Platform – A Java-based modular peer-to-peer multi-application desktop grid computing platform for irregularly structured problems, developed at the University of Tübingen, Germany (http://www.cohesion.de/cms).

List of distributed computing projects – Grid computing infrastructure

• The European Grid Infrastructure (EGI) – A series of projects funded by the European Commission which links over 70 institutions in 27 European countries to form a multi-science grid computing infrastructure for the European Research Area, letting researchers share computer resources.

List of distributed computing projects – Grid computing infrastructure

• GridECON takes a user-oriented perspective and creates solutions to grid challenges to promote widespread use of grids (www.gridecon.eu).

List of distributed computing projects – Grid computing infrastructure

• neuGRID develops a new user-friendly grid-based research e-infrastructure enabling the European neuroscience community to perform the research needed for the study of degenerative brain diseases, for example, Alzheimer’s disease.

List of distributed computing projects – Grid computing infrastructure

• OMII-Europe – An EU-funded project established to source key software components that can interoperate across several heterogeneous grid middleware platforms.

List of distributed computing projects – Grid computing infrastructure

• OurGrid aims to deliver grid technology that can be used today by current users to solve present problems. To achieve this goal, it uses a different trade-off compared to most grid projects: it forfeits supporting arbitrary applications in favor of supporting only bag-of-tasks applications.

List of distributed computing projects – Grid computing infrastructure

• ScottNet NCG – A distributed neural computing grid.

List of distributed computing projects – Grid computing infrastructure

• Legion – A grid computing platform developed at the University of Virginia.

List of distributed computing projects – Physical infrastructure projects

These projects attempt to make large physical computation infrastructures available for researchers to use:

List of distributed computing projects – Physical infrastructure projects

• Debian Cluster Components

List of distributed computing projects – Physical infrastructure projects

• DiaGrid – A grid computing network centered at Purdue University.

List of distributed computing projects – Physical infrastructure projects

• SARA Computing and Networking Services in the Netherlands (http://www.sara.nl/userinfo/lisa/usage/batch/index.html)

Influenza research – Distributed computing

Folding@home, a distributed computing program from Stanford University, is researching how viruses pass through the cell membrane (whereas most treatments focus on preventing viral replication) and what role proteins play. They are currently focusing their research on influenza.

Influenza research – Distributed computing

[email protected] is working on a Spanish flu inhibitor to block the flu infection.

For More Information, Visit:

store.theartofservice.com/the-distributed-computing-toolkit.html
