TEAM: Tightening knowledge sharing in distributed software communities

September 2006–August 2009

#1. Project description

##Project Summary

The TEAM project addresses the need for a knowledge sharing environment with advanced capabilities suitable for the distributed engineering and management of software systems. It aims to develop an open-source software system, seamlessly integrated into a software development environment, that enables decentralised, personalised and context-aware knowledge sharing through:

- Knowledge Desktop, a component embedded in the software development environment that provides a graphical interface for knowledge manipulation, as well as an integration platform for the other knowledge sharing components
- Context Observer, a component that enables capturing a software developer’s behaviour in his desktop environment (i.e. context elicitation), its analysis (i.e. context processing), as well as triggering the knowledge manipulation mechanism (i.e. knowledge acquisition and delivery)
- History Analyser, a component that enables elicitation of a software developer’s profile from his previous behaviour
- Semantic Search, a component that supports context-aware ontology-based proximity search for relevant knowledge items
- Semantic Recommendation, a component that enables proactive knowledge delivery depending on the actual working and personal context of the user
- Metadata Repository, a component that enables efficient structuring and persistent storage of the acquired knowledge, as well as reasoning about its completeness and consistency
- P2P Infrastructure for decentralised communication between local Knowledge Desktops of software developers.

##Project Objectives

Working in distributed teams is rapidly gaining importance in professional software engineering: for instance, in distributed or virtual organizations, in nearshore/offshore settings, when working simultaneously in the software organization and with the customer, when cooperating with a software component supplier, or in open-source projects. In such settings, communication and diffusion of information become more difficult, and sharing the expertise needed for efficient production of high-quality software requires more time and effort.

The problem lies not only in the geographic distribution of people, but also in phenomena that often accompany it, such as different working times, different prior experience or working culture, different technical languages or organization-internal slang, different working/programming styles, or psychological effects like a stronger “not invented here” attitude when working with external colleagues.[^1] Moreover, long-distance collaboration is not the only source of communication bottlenecks; the situation is made worse by a growing need for extensive communication and knowledge sharing. The “construction by configuration” approach is expected to become the predominant paradigm in software development in the coming years[^2] (Component Frameworks and Service Oriented Architecture are two examples of this approach). This means that reuse happens at a higher level of abstraction and complexity than before: correct integration of a complex software component, effective work with a powerful software framework, and successful use of a design pattern all require significant background and experience knowledge about the software artefact used, and sometimes even context knowledge, e.g., about the application domain.

Further, since the globalization of software production enables new alliances and more effective value chains for software creation, the turn-around time for knowledge is decreasing, i.e., new versions arrive faster, and experience with old versions ages rapidly. [Rus & Lindvall, 2002] identify the following major fields of knowledge that are critical for the success of software projects:

(1) Knowledge about technologies;

(2) Domain knowledge;

(3) Knowledge about local policies and practices; and

(4) “Who knows what” knowledge.

The authors also emphasize the importance of team collaboration and knowledge sharing in distributed groups. Altogether, effective experience sharing and efficient re-learning of new features become key to success: in a situation where the reuse of all kinds of artefacts will be the focal point in Software Engineering (SE), not only the semantic description of the functionalities or characteristics of an artefact, but also suitable methods to understand and make appropriate use of the artefact will be the key factor for achieving efficiency and ensuring quality of software production.

On the other hand, existing approaches for sharing software engineering knowledge focus mostly on implementing the Experience Factory concept in the form of knowledge repositories. This has proved suitable only for large software companies, since small and medium-sized ones usually cannot afford to create centralized organisational units for organisational learning. Moreover, for distributed and open-source software development, such a centralised approach is certainly not an optimal solution.

Finally, although the Experience Factory is a useful solution for sharing general project experience, even large companies have not achieved an increase in software quality by introducing Experience Factories. Indeed, the Experience Factory embodies the assumption that, to share and exploit knowledge, it is necessary to implement a process of knowledge extraction and refinement whose aim is to eliminate all subjective and contextual aspects of knowledge and to create an objective and general representation that can then be reused by other people in a variety of situations (like an experience package). However, software implementation knowledge is so divergent and implicit that mapping it into predefined packages loses some of the characteristics that can be important for reuse. For example, the knowledge about how to use a class encompasses the web sites the developer visited, the input he posted in the wiki, the errors he made when instantiating variables, etc.; in other words, any situation (i.e. working context) in which a user was involved in order to clarify (explicitly or implicitly) the meaning of that class.

However, it is difficult to define a universal representation of knowledge, since not only the last action a user performed in resolving a problem is relevant for knowledge sharing, but all the actions he performed in the context of resolving it. Indeed, because the perception of knowledge is highly individual, information from a web page may not have been useful for one user, yet be useful for another user who is resolving the same problem in a different context or with different preferences. These interdependencies between information sources, and their implicit relations to the given problem in a given context, are exactly what is lost when knowledge is packaged.

Therefore, instead of monolithic, closed packages of distilled knowledge stored in a central repository, what is needed are flexible structures that link raw information locally (at a user’s desktop) and that can be easily accessed and further processed in a given personal context. Moreover, expecting a user to perform all this structuring of knowledge manually is not realistic. Hence, a new, lightweight, more decentralized, personalized and context-sensitive concept for sharing software development knowledge is needed. Due to the heterogeneity of a distributed community (e.g., regarding its domain knowledge, experience, vocabulary, etc.), the new concept requires a proper abstraction mechanism in order to facilitate efficient communication. Finally, the concept should be more seamlessly integrated into software development environments in order to enable knowledge creation with minimal overhead and proactive knowledge delivery.

The TEAM project addresses exactly the need for such a knowledge sharing environment, thereby supporting advanced capabilities in the distributed engineering and management of software systems. In particular, we propose a decentralized, personalized, context-sensitive and semantic-based framework for sharing knowledge about software implementation that is seamlessly integrated into a software development environment (IDE). As a proof of concept we will realize the framework within the Eclipse environment.

However, the concept is general enough to be applied, in principle, to every phase of the software development process and in every environment. Its distributed and decentralised (peer-to-peer) realisation ensures robustness and simplifies the maintenance of the communication channels. The comprehensive modelling of situations in which a software developer could require information support enables the automatic discovery and articulation of his information need (i.e. proactivity) as well as the representation of informal knowledge. In addition, a user’s personal context can be learned from his previous behaviour, which ensures personalized information delivery.
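As a rough illustration of the proactivity idea described above, the following Python sketch maps observed developer actions to modelled situations and derives an information need from them. It is only a sketch under stated assumptions: the names `ContextEvent`, `InformationNeed`, `SITUATION_RULES` and `detect_information_need`, as well as the event kinds, are hypothetical and are not part of the TEAM design, which is planned as an Eclipse-based system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextEvent:
    """One observed developer action (hypothetical structure)."""
    kind: str                                   # e.g. "compile_error", "open_class"
    subject: str                                # e.g. the class or API involved
    details: dict = field(default_factory=dict)


@dataclass
class InformationNeed:
    """A query articulated automatically from the working context."""
    situation: str
    query_terms: list


# Hypothetical situation model: event kind -> situation that may need support.
SITUATION_RULES = {
    "compile_error": "error_resolution",
    "import_added": "component_reuse",
}


def detect_information_need(events: list) -> Optional[InformationNeed]:
    """Return a need for the most recent event matching a modelled situation."""
    for event in reversed(events):
        situation = SITUATION_RULES.get(event.kind)
        if situation is not None:
            terms = [event.subject]
            message = event.details.get("message")
            if message:
                terms.append(message)
            return InformationNeed(situation, terms)
    return None  # no modelled situation was observed


history = [
    ContextEvent("open_class", "OrderService"),
    ContextEvent("compile_error", "OrderService",
                 {"message": "unresolved type Invoice"}),
    ContextEvent("scroll", "OrderService"),
]
need = detect_information_need(history)
print(need)
```

In this sketch the articulated need could then be handed to a search or recommendation step; events that match no modelled situation, such as the final scroll, are simply skipped.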

Finally, the usage of ontologies as the backbone of the framework supports not only unambiguous communication, by establishing a commonly understandable description of knowledge, but, more importantly, the proper level of abstraction that enables proximity search. Consequently, the proposed approach ensures the production of more robust, flexible, high-quality software systems. Moreover, it lays the foundation for a more efficient usage of knowledge and semantic technologies in the software development process, which is one of the strategic goals of the European vision for Software, Services and Systems[^3] (the so-called S3 Initiative).
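The role of ontologies in proximity search mentioned above can be illustrated with a small Python sketch. It assumes a toy is-a hierarchy held in a plain dictionary and measures proximity as the number of edges between two concepts via their closest common ancestor. The concept names, the knowledge items, and the `max_distance` threshold are invented for illustration; the sketch does not use OWL or KAON2 and is not the project's actual search algorithm.

```python
# Hypothetical mini-ontology: concept -> parent concept (None marks the root).
PARENT = {
    "Artefact": None,
    "Component": "Artefact",
    "Framework": "Artefact",
    "GUIComponent": "Component",
    "PersistenceComponent": "Component",
    "WebFramework": "Framework",
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def distance(a, b):
    """Count is-a edges between two concepts via their closest common ancestor."""
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for steps_from_a, c in enumerate(ancestors(a)):
        if c in steps_from_b:
            return steps_from_a + steps_from_b[c]
    raise ValueError("concepts share no common ancestor")


def proximity_search(query_concept, items, max_distance=2):
    """Rank items whose annotation lies within max_distance of the query."""
    scored = [(distance(query_concept, concept), title)
              for title, concept in items]
    return sorted((d, t) for d, t in scored if d <= max_distance)


# Knowledge items annotated with one concept each (illustrative data).
ITEMS = [
    ("Wiki note on table widgets", "GUIComponent"),
    ("Forum thread on ORM mapping", "PersistenceComponent"),
    ("Tutorial on request routing", "WebFramework"),
    ("General reuse checklist", "Component"),
]

results = proximity_search("GUIComponent", ITEMS)
for dist, title in results:
    print(dist, title)
```

A keyword match on "GUIComponent" would return only the first item, whereas the hierarchy also surfaces items annotated with the parent and sibling concepts, which is the kind of abstraction the paragraph above refers to.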

As the integration platform we will use the Eclipse IDE, although the framework will be realized generically enough to be easily applied in other IDEs. In order to achieve this aim we define two supporting objectives:

- To create a conceptual model for sharing knowledge in distributed software communities that can be seamlessly integrated in a software development environment, through:
  - semantic-based modelling of a local knowledge sharing environment of a software developer[^4] (e.g. ontology-based models of a knowledge artefact, available knowledge sources, etc.)
  - semantic-based modelling of the working context in which a user is involved (e.g. what his task is, which information is presented to him), including the situations in which he could require knowledge support (e.g. a compile error, reuse of a component)
  - semantic-based modelling of a user’s preferences regarding knowledge support
  - a semantic-based knowledge manipulation mechanism that enables undisruptive knowledge acquisition, knowledge access based on semantic-driven similarity, as well as proactive knowledge delivery
  - an ontology-based knowledge repository that enables representation of informal knowledge by defining semantic dependencies between knowledge artefacts, and
  - an ontology-based peer-to-peer communication model that supports semantically-routed knowledge flow between local knowledge sharing environments, in order to ensure decentralised but synchronised decision making.
- To investigate the usefulness of the conceptual model and the knowledge sharing software system in practical scenarios. The TEAM platform will be piloted and evaluated in four different scenarios: (i) a distributed software development case that includes a large company with its offshore partners, (ii) distributed software development in near-shoring scenarios, (iii) an Open Source Community, and (iv) a middle-sized software company, in order to illustrate the generality of the proposed approach. The evaluation of the project results will not be limited merely to the technical evaluation; rather, it will take into account both organisational and social aspects of the project.

We note that the existing standards for representing knowledge on the Web, i.e. the OWL ontology language, will be used as a backbone of our conceptual model. Moreover, the knowledge sharing software system will be implemented on top of our KAON2 ontology management system[^5] and existing P2P development frameworks.

The achievement of the set project objectives will be measured based on the following success and progress indicators. The indicators will be revised and updated in the course of the project in order to reflect the detailed user needs and related technical objectives of the project.

Science & Technology objective of the TEAM proposal: To develop an open-source software system, seamlessly integrated in a software development environment, for enabling decentralised, personalised and context-aware knowledge sharing.

##Partners & Duration

The TEAM consortium consists of 10 partners (companies & universities) located in 8 European countries:

- Planet S.A.
- Forschungszentrum Informatik an der Universität Karlsruhe
- Institute of Communication and Computer Systems, National Technical University of Athens
- Technische Universität München
- Ecole Polytechnique Fédérale de Lausanne
- CIM College d.o.o
- Intrasoft International S.A.
- Linux Industrial Association
- THALES
- TXT e-solutions S.p.A

The project starts in September 2006 and its duration is 30 months.

##Role of ICCS

The main role of ICCS in TEAM is to design and develop the TEAM Knowledge Desktop. The Knowledge Desktop objectives are to conceptualize, implement, and test an IDE plug-in (realized for Eclipse) that offers a common GUI for a wiki-like knowledge articulation interface, semantic search, and recommendation.

[^1]: See also [Dingsoyr et al., 2004].
[^2]: ftp://ftp.cordis.lu/pub/ist/docs/directorate_d/st-ds/visdoc.pdf
[^3]: ftp://ftp.cordis.lu/pub/ist/docs/directorate_d/st-ds/visdoc.pdf
[^4]: We use the term users for developers.
[^5]: http://kaon2.semanticweb.org/