Interactive methods for graph exploration

In a strategic watch context, visualization of relational data allows large quantities of data to be transformed, coded and displayed. Giving the user access to interactive, adjustable functionalities facilitates mastery of the analysis and improves its precision. From this point of view, the VisuGraph tool allows visualization and exploration of relational data by means of applicable and controllable analysis methods. The main interactive functionalities of VisuGraph are presented and illustrated, revealing their importance in graph exploration. The user is at the heart of the tool; he or she fully controls the representation and directs the analysis according to his or her own needs.


Introduction
Complex and bulky data sources make it difficult to identify relations and relevant trends. For the majority of users, searching for useful and exploitable information is a long and tiresome process. It demands considerable effort from the data processing specialists in charge of handling requests and generating ad hoc reports and ratios. In a strategic watch context, analysts must be able to explore large volumes of data interactively. They should be able to study trends, test various approaches and, above all, isolate invaluable information that brings a competitive advantage.
Relational data visualization allows large volumes of data to be transformed, coded and charted effectively. This technique offers the user a clear and readable representation of information that is initially difficult to access. Analysis methods that allow the relevant data to be explored complete this representation. Thus, any proposal for a relational data visualization tool rests on an interest in the following three fundamental aspects:
• the nature of the data represented,
• the way in which the components of the graph are exploited to transcribe these data,
• the perception of these components by the user.
Available for free online at https://ojs.hh.se/ Journal of Intelligence Studies in Business 2 (2012) 21-31
The objective of visualization is not limited to producing pre-set charts that cannot be changed by the user. Indeed, an important criterion for a good visualization tool is the possibility for the user to control the representation in order to understand the information space and to interact with the system. In this respect, visualization falls within the field of man-machine interaction.
In this article, the author presents a model of interactive visualization of relational data, named VisuGraph (Loubier and Dousset 2008).
Communication between the user and the system is the main concern in this context, which is that of proposing a tool to assist the analysis of relational data. This approach to information analysis is based on visualization interfaces. It allows data to be explored through rich charts and interaction modes adapted to the analyst's tasks. The visualization components and the navigation help the analyst, and more particularly "the watcher", and contribute to the development of technology for economic, strategic and competitive intelligence.
Under the control of an expert, who chooses a suitable mode of representation ("semiology and esthetics" rules), relational data are represented in graph form. The purpose of the visualization is to give a precise idea of the data and their relations. The objective is to propose a representation of information that allows strategic structures to be identified, analysed and reported. This system makes it possible to detect different connections and to analyse, at a specific time, the actors of a field and the concepts they use. The tool is supplemented by interactive graph analysis processes, which allow the user to control his or her representation. In this article, we focus on the interactive functionalities of the VisuGraph tool, and particularly on the targeted and comparative study of nodes. First, the stakes of data visualization, in particular with regard to cognition, are presented, along with the principal fields of application.
Second, the relational data visualization methodology proposed in VisuGraph is set out. Access to traditional analysis methods makes it possible to isolate particular nodes and to obtain key information that enriches and facilitates the graphic analysis. A concrete example illustrates the effectiveness of such a visualization tool. This example makes it possible to analyse complex interaction networks, the actors and fields involved, and their evolution, through analyses of the graph structure.
It then becomes possible to detect the various trends, together with strong and weak signals.

Stakes of the visualization of data
For about fifteen years, under the impetus of researchers such as Card, Mackinlay and Shneiderman (1999), Spence (2000) and Ware (2000), information visualization has become a serious research field. Many contributions have sought to bring together the available representation formalisms in order to render spatio-temporal processes (Langran 1993; Gayte et al. 1997; Frank, Rapper and Cheylan 2001). More recently, geographers and cartographers have also taken an interest in these questions (Passover 2007; Josselin and Fabrikant 2003).
"The principle of information visualization is to use the power of computer tools to represent effectively, from a cognitive point of view, abstract data which do not necessarily have a usual physical representation. These techniques, which aim at amplifying cognition via perception, aim in particular to facilitate the discovery and the creation of ideas starting from masses of data that are difficult to apprehend because of the quantity or the complexity of the information they contain." (Fekete and Lecolinet 2006).
Many techniques of information visualization have been proposed over the last fifteen years, as shown by the work identifying six leading causes of cognition amplification by visualization (Card, Mackinlay and Shneiderman 1999), by Shahar and Cheng (2000), or by the visualization of temporal association rules (Rainsford and Roddick 2000). Analysis of the evolution of relational information is mainly based on the visualization of dynamic graphs. Many researchers have developed network display systems (Di Battista et al. 1999), taking into account the mapping of Internet connectivity, telephone call networks, citation networks and the progressive visualization of evolving fields of knowledge.
We consider the evolution of information visualization as a research theme following Kapusova (2004), who combines aspects of scientific visualization, man-machine interfaces (human-computer interfaces), data mining, imagery and graphics. For Fekete and Lecolinet (2006), information visualization grew out of three related fields, namely man-machine interaction, statistical analysis and cartography, as well as scientific visualization. Thus, a distinction must be made between visualization, which refers to the process that leads to a chart, and the interactive information chart, which deals with the means of interaction that information charts offer. The user's role in data visualization tools is a subject of major concern (Grinstein 1996; Fayyad, Grinstein and Wierse 2002). Interactive visualization of relational data thus provides the user with an artificial substrate that transcribes a large amount of information. It acts as a support for the user's knowledge and intuition, enabling him or her to discover new relations, helping with decision making and allowing anticipation, including of the evolution of these data. In visual data mining, interaction materializes the feedback loop between the user and the visual aids (Keim and Kriegel 1996). Most visualization tools offer access to powerful statistical methods of analysis, but these methods are not really interactive and users can only seldom control the chart as a whole.

Methodology
VisuGraph is a data visualization tool developed in Java (Loubier and Dousset 2008). The list of alliances is regarded as a document population. Two actors, represented graphically as nodes, are considered co-occurrent if they are present in the same alliance (there can be more than two actors per alliance). All of these co-occurrences are counted in a square matrix that crosses the actors pairwise.
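
The counting step described above can be sketched as follows. This is only an illustration in Python, not VisuGraph code (the tool itself is written in Java); the function name `cooccurrence_matrix` and the sample alliances are hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(alliances):
    """Count, for every pair of actors, the alliances in which both appear."""
    actors = sorted({actor for alliance in alliances for actor in alliance})
    index = {actor: i for i, actor in enumerate(actors)}
    size = len(actors)
    matrix = [[0] * size for _ in range(size)]
    for alliance in alliances:
        # set() guards against an actor listed twice in the same alliance.
        for a, b in combinations(sorted(set(alliance)), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1  # the matrix is symmetric
    return actors, matrix


# Hypothetical data: three alliances, one of them with three actors.
alliances = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
actors, matrix = cooccurrence_matrix(alliances)
```

Each non-zero cell of the matrix then corresponds to an edge of the graph, and its value can serve as the metric of that edge.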
The data are represented in a simple graph G characterized by two sets: a set V = {v_1, v_2, …, v_n} whose elements are called vertices (or nodes), and a set E = {e_1, e_2, …, e_m}, a subset of the set of two-element subsets of V, whose elements are called edges. G is written G = (V, E). G is undirected (there is no distinction between (u, v) and (v, u) for u and v in V) and simple (there is no loop (v, v) in E and there is at most one edge between two nodes). Visual devices make it possible to represent information as well as possible, by means of a particular semiology for the shapes of the nodes and the colors used. In VisuGraph, data are represented as circles whose size is proportional to the value of the metric. Edges are represented by segments joining two nodes, coded by color.

a. Interactivity on semiology
Understanding a graphic visualization depends on the construction rules of a symbolic system. The study of signs and their meaning is called semiology. It also relies on a codified use of writing and on general aesthetic principles. Bertin (1970) is regarded as the pioneer of information cartography. He was interested in the construction of visualizations from graphic symbols.
Graphic semiology is based on:
• the significance of the drawings,
• the choice of legends, symbols and icons,
• a methodology for transmitting a visual message.
The quality of a semiology depends on the user's ability to control it fully. With regard to color variation, value is a variation of luminous intensity from darkest to lightest, or conversely. It conveys an order relation and differences (quantitative relation).
However, our capacity for recognition is much more limited than our ability to appreciate differences:
• on the one hand, the differential sensitivity of the eye to luminous energy is not directly proportional to the intensity of the flux; differences are appreciated less well in light colors than in dark ones,
• on the other hand, our differential chromatic sensitivity is not uniform and is specific to each individual.
Since the nodes and the edges are weighted, color is used first of all to code the metric value of each element of the graph. Nodes and edges with higher metric values are colored with a stronger intensity, and conversely.
The user can accentuate the contrast of the colors used. Thus, for the color coding of the edges, a graduated scale is placed at the user's disposal to attenuate or increase the intensity of the color. The same applies to data labels, whose size and intensity can be adjusted, making it possible to control fully the prominence of this information. Indeed, large, intense labels make the visualization less readable than small labels in a color close to the background of the representation.
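
A minimal sketch of such a color coding is given below, in Python. The article does not specify the mapping used by VisuGraph, so the power-law `contrast` parameter, the grey-level conversion and the function names are assumptions made for illustration only.

```python
def color_intensity(value, v_min, v_max, contrast=1.0):
    """Map a metric value to an intensity in [0, 1] (1 = strongest color).

    `contrast` stands for a hypothetical slider value: 1.0 keeps the mapping
    linear, values above 1.0 fade weak elements faster, values below 1.0
    flatten the differences.
    """
    if v_max <= v_min:
        return 1.0  # all elements share the same metric value
    t = (value - v_min) / (v_max - v_min)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the range
    return t ** contrast


def grey_level(intensity):
    """Convert an intensity to an 8-bit grey level (0 = black = strongest)."""
    return round(255 * (1.0 - intensity))


# Hypothetical edge metrics.
edge_weights = {("a", "b"): 1, ("b", "c"): 5, ("a", "c"): 9}
low, high = min(edge_weights.values()), max(edge_weights.values())
linear = {e: color_intensity(w, low, high) for e, w in edge_weights.items()}
sharp = {
    e: color_intensity(w, low, high, contrast=2.0)
    for e, w in edge_weights.items()
}
```

With the linear setting the middle edge receives half of the maximum intensity; raising the contrast setting pushes it towards the background while the strongest edge keeps its full intensity.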
In the following figure, five nodes are extracted from a global graph. The initial font, used here as an example, is small and of low intensity so as not to impair the legibility of the graph. The middle graph results from increasing the values of the graduated scales for the intensity of the font and of the links between the nodes, and for the size of the font. The right-hand graph results from an even larger increase in these values.

b. Attraction and repulsion forces
Any graph analysis relies on the clarity and legibility of the representation. In the case of a non-planar graph, the number of edge crossings can quickly make the graph illegible and complicate its interpretation. Much work has been devoted to powerful force-directed placement algorithms (FDP: "Force Directed Placement") (Tutte 1963; Eades 1984; Kamada and Kawai 1989; Fruchterman and Reingold 1991).
The FDP algorithm proposed in VisuGraph is based on the work of Fruchterman and Reingold (1991) and allows the user to act on the application of the forces, increasing or decreasing them by means of two scales, one for the attraction force and one for the repulsion force. The attraction force between nodes can be proportional to the strength of the link between them. The attraction force between two nodes u and v is given by: (1) The factor K is calculated from the drawing surface and the number of nodes. d_uv is the distance between u and v in the drawing.
The attraction coefficient corresponds to the value of the attraction scale divided by two. It is used to define the degree of attraction between two nodes. K ensures that every edge is drawn inside the representation window and not outside it. L and l represent the length and the width of the window, and nbtops corresponds to the number of visible nodes of the graph. (2) If the nodes u and v are not connected by an edge, then ƒa(u, v) = 0. The repulsion force between two nodes u and v is defined by: (3) The repulsion coefficient corresponds to the value of the repulsion scale, through which the user acts on the repulsion; it defines the degree of repulsion between two nodes u and v.
Thus, the higher the attraction threshold chosen by the user, the more the linked nodes attract one another, which favours the overall drawing of the graph structure to the detriment of the relations between nodes. In the same way, the higher the repulsion threshold, the more the structure is spread out: unlinked nodes are pushed apart and those joined by an edge are more clearly distinguished, giving a more exploded structure.
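
The effect of the two scales can be explored with the sketch below, written in Python. It uses the standard Fruchterman and Reingold (1991) forces, d²/k for attraction and k²/d for repulsion, and simply multiplies each by a user coefficient. How the coefficients enter VisuGraph's own equations (1) to (3) is not recoverable from the text, so these multipliers, the cooling schedule and all function names are assumptions.

```python
import math
import random


def attractive_force(d, k, scale=1.0):
    """Fruchterman-Reingold attraction d^2 / k, times an assumed user scale."""
    return scale * d * d / k


def repulsive_force(d, k, scale=1.0):
    """Fruchterman-Reingold repulsion k^2 / d, times an assumed user scale."""
    return scale * k * k / d


def force_directed_layout(nodes, edges, width, height,
                          attraction=1.0, repulsion=1.0,
                          iterations=50, seed=0):
    """Return {node: [x, y]} after a fixed number of placement iterations."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = min(width, height) / 10.0     # caps the move per iteration
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion acts between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = repulsive_force(d, k, repulsion)
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction acts only along edges (it is zero for unlinked nodes).
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = attractive_force(d, k, attraction)
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each node, capped by the temperature, and keep it in the window.
        for v in nodes:
            length = max(math.hypot(disp[v][0], disp[v][1]), 1e-9)
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + disp[v][0] / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + disp[v][1] / length * step))
        temperature *= 0.95  # cool down so that the layout settles
    return pos


nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
# Median settings: both scales at 5, the attraction value being halved.
positions = force_directed_layout(nodes, edges, 800, 600,
                                  attraction=5 / 2, repulsion=5)
```

Edge weights are left out of this sketch; making the attraction proportional to the strength of the link would amount to multiplying `attractive_force` by the edge metric.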
Combining the settings of these two forces leads to a more readable representation, as shown in figure 2, where graph (1) is a planar graph to which no FDP algorithm has been applied, and graph (2) is the result of applying our algorithm with the attraction and repulsion scales set to median values (5 for each, on scales from 1 to 10).

c. Transitivity
Navigation in the initial graph is often too complex. To make it manageable, it is possible to work on a subgraph. We start from a particular node selected in the global graph and gradually extend the subgraph by transitivity. By changing the radius, this technique makes it possible to concentrate on a relevant extract built around targeted information (actor, key word or concept). Measures of degree centrality and of constraint favour the local point of view. More precisely, a data item is said to be central if it is strongly connected to the other members of the graph. The concept of centrality makes it possible to specify the dominant position of an actor, or node, in the network (Freeman 1979). We base our work on the algorithm of Floyd (1962), a generalization to valued graphs of the algorithm for computing the transitive closure of a graph, discovered almost simultaneously in France by Roy (1959) and in the United States by Warshall (1962).
The transitive closure of a graph G = (X, A) is the minimal transitive relation containing the relation (X, A); it is a graph G* = (X, A*) such that (x, y) ∈ A* if and only if there exists a path in G beginning at x and ending at y. Computing the transitive closure answers questions about the existence of paths between x and y in G, for any pair of nodes (x, y). (X, A*) is computed by iterating the basic operation ϕ_x(A), which adds the arcs (y, z) such that y is a predecessor of x and z one of its successors. More formally: (4) Definition: For any node x, (5) For any pair of nodes (x, y), (6) The transitive closure A* is given by: (7) In our contribution, the user selects a specific node in the graph. The other nodes are masked and only the initially selected node remains visible.
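
The iteration of the basic operation can be illustrated by the following Python sketch of the Roy-Warshall scheme: each node x is taken in turn as a pivot, and an arc (y, z) is added whenever y is a predecessor of x and z one of its successors. The function name and the sample arcs are hypothetical, and the set-based representation is chosen for readability rather than speed.

```python
def transitive_closure(nodes, arcs):
    """Return the set of arcs of the transitive closure of (nodes, arcs)."""
    closure = set(arcs)
    for x in nodes:  # one application of the basic operation per pivot x
        predecessors = [y for (y, t) in closure if t == x]
        successors = [z for (s, z) in closure if s == x]
        closure |= {(y, z) for y in predecessors for z in successors}
    return closure


# Hypothetical directed example: 1 -> 3 -> 2 -> 4, node 5 is isolated.
nodes = [1, 2, 3, 4, 5]
arcs = {(1, 3), (3, 2), (2, 4)}
closure = transitive_closure(nodes, arcs)
```

For an undirected graph such as those handled here, each edge would be entered as two opposite arcs.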
The transitivity level is changed by means of a scale. The first step shows the direct neighbours of the node, which become visible together with their links to the initial node. The more the scale value is increased, the higher the transitivity threshold. Studying the structure of the graph, and in particular the topology around a particular node, makes it possible to characterize this node and to study its role within the whole of the studied population. If transitivity reveals many direct links to other data, this shows the major importance of the data item represented. Studying subgraphs by transitivity makes it possible to carry out a sharper analysis and to detect the most important actors. It is also interesting to compare the typology of several important actors of the graph against the same criteria:
• the number of links with direct neighbours (first transitivity threshold) and its importance at the local level,
• the importance of the data item studied within the global graph.
For this purpose, an image of the first transitivity threshold must be preserved for each node studied.
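
The threshold mechanism can be sketched as a breadth-first search that stops at a given distance from the selected node. The Python code below is an illustration under that assumption; the names `neighbourhood` and `snapshots` and the sample graph are hypothetical and do not come from VisuGraph.

```python
from collections import deque


def neighbourhood(adjacency, start, threshold):
    """Nodes reachable from `start` using at most `threshold` edges."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if distance[u] == threshold:
            continue  # do not expand beyond the chosen threshold
        for v in adjacency.get(u, ()):
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    return set(distance)


# Hypothetical undirected graph stored as adjacency sets.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
}
# Threshold 0 shows the node alone, threshold 1 adds its direct neighbours.
alone = neighbourhood(adjacency, "a", 0)
first = neighbourhood(adjacency, "a", 1)
# One preserved image of the first threshold per studied node.
snapshots = {node: neighbourhood(adjacency, node, 1) for node in ("a", "d")}
```

Comparing the sizes of the stored first-threshold sets gives a simple basis for comparing the local importance of the studied nodes.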
We added to VisuGraph a functionality that preserves a precise image of a node's transitivity at a given moment, in the form of a small independent window located near the main visualization window. Comparing the structure of the different subgraphs obtained by transitivity makes it possible to distinguish remarkable elements. In the following figure, the data studied are authors who took part in a scientific congress on strategic watch. In the context of this article, the data serve only to illustrate the interactive methods, not to provide a complete analysis of this congress. To keep the graph legible, we reduced the size of the labels as far as possible. In the global graph representation, the data with the strongest centrality stand out, and the subgraphs built on these data are extracted so that they can be compared. The upper graph corresponds to the global graph of the whole dataset. Our interest lies in one particular node of the global graph that is strongly connected to the other nodes. It is then interesting to compute the transitive closure of this node and to study its structure. To do so, the node is first isolated (3) by masking the other nodes. Next, by changing the graduation of the transitivity scale, the direct neighbours of this node are obtained (4). The large number of direct connections of this node is remarkable: it means that the author is a key actor within his team and collaborates frequently with other researchers. By increasing the transitivity threshold, we obtain graph (5), which represents the maximum transitivity threshold for this node. Comparing this visualization with the global graph reveals how similar the two are. The author initially selected can thus be described as one of the elements at the origin of the overall structure of the graph, i.e. a very important author.
Figure 2: Extraction starting from the total graph of a specific top and calculation of its transitivity.

d. Filtering
Filtering is based on the metric values. It consists in keeping only the nodes and edges of the graph whose values are greater than or equal to a threshold fixed by the user via a scale. This procedure reveals the most representative nodes, as well as the important components of the structure. The result can be displayed by completely masking the edges, and consequently the nodes adjacent to them, whose metric value is below the threshold defined by the user. This kind of representation extracts the elements that are most representative of the graph in terms of metric value (Loubier and Dousset 2008).
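
One possible reading of this filtering step is sketched below in Python. The rule that an edge is also hidden when one of its endpoints is hidden is an assumption of the sketch, as are the function name and the sample metrics.

```python
def filter_graph(node_metric, edge_metric, threshold):
    """Keep the nodes and edges whose metric is >= threshold.

    An edge is also masked when one of its endpoints is masked, so that no
    visible edge points to a hidden node (an assumption of this sketch).
    """
    nodes = {v for v, value in node_metric.items() if value >= threshold}
    edges = {
        (u, v)
        for (u, v), value in edge_metric.items()
        if value >= threshold and u in nodes and v in nodes
    }
    return nodes, edges


# Hypothetical metrics for three nodes and three edges.
node_metric = {"a": 5, "b": 3, "c": 1}
edge_metric = {("a", "b"): 4, ("a", "c"): 4, ("b", "c"): 1}
visible_nodes, visible_edges = filter_graph(node_metric, edge_metric, threshold=3)
```

Raising the threshold step by step, as the scale allows, progressively strips the graph down to its strongest elements.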

e. K-core
The k-core decomposition (Batagelj and Zaversnik, 2002) identifies particular subsets of the graph, called k-cores, in which every node has a degree of at least k; it can be computed in O(n + m) time, where n and m are respectively the number of nodes and edges in the network. Applied to VisuGraph, the k-core is computed from a threshold fixed by the user via a scale. The higher this threshold, the higher the coreness. The graph obtained corresponds to the k-core decomposition for the chosen threshold value.
In the following figure, the k-core is applied with k = 5, which means that only the nodes having at least five links are kept. The results obtained make it possible:
• to visualize the main actors who have collaborated;
• to distinguish the various teams, i.e. the sets of actors who work together the most.
Several different communities are thus distinguished by this method. The global graph contains three large teams. Within each of these groups, one can distinguish the major actors of each team, who lie at the centre of the connections and whose absence would therefore split the team, such as the nodes circled in the bottom graph of figure 2.
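
The extraction of a k-core for a threshold k chosen by the user can be sketched as a recursive pruning of the nodes whose degree falls below k. The Python code below is a simple illustration, not the linear-time algorithm of Batagelj and Zaversnik (2002); the function name and the sample graph are hypothetical.

```python
def k_core(adjacency, k):
    """Return the nodes of the k-core of an undirected simple graph.

    `adjacency` maps each node to the set of its neighbours and is assumed
    to be symmetric and free of self-loops.
    """
    degree = {v: len(neighbours) for v, neighbours in adjacency.items()}
    removed = set()
    stack = [v for v in adjacency if degree[v] < k]
    while stack:
        v = stack.pop()
        if v in removed:
            continue  # a node may have been pushed more than once
        removed.add(v)
        for u in adjacency[v]:
            if u not in removed:
                degree[u] -= 1  # u loses the pruned neighbour v
                if degree[u] < k:
                    stack.append(u)
    return set(adjacency) - removed


# Hypothetical graph: a triangle a-b-c with a chain c-d-e attached to it.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
core = k_core(adjacency, 2)
```

In this example the chain is pruned in cascade: removing e lowers the degree of d below 2, so d is removed as well and only the triangle remains.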

Conclusion
In this article, we presented several interactive functionalities of the temporal data visualization tool named VisuGraph. Taking the user's point of view and directives into account, both in terms of needs and in terms of hands-on intervention, is central to the design of a tool that assists analysis and decision making. Neither the tool nor the visualization should be fixed. The user must be free to control his or her representation fully, acting both on the various statistical methods offered and on the semiology suggested by the tool. In this article, we focused on the statistical analysis functionalities offered by the VisuGraph tool, namely:
• the force-directed node placement algorithm,
• transitivity,
• filtering,
• k-core.
These methods facilitate and improve the exploration and analysis of the graph structure, as well as the study of specific nodes. Our contribution lies in the interactivity offered for these traditional statistical methods. This interactivity between user and system makes it possible to control the chart fully by means of scales, and thus to target the analysis in order to gain precision about the structure of the graph. Thus, the VisuGraph tool answers the principal ideas proposed by Card, Mackinlay and Shneiderman (1999).

Figure 1: Interactivity on the semiology of the graph.
The decomposition in k-core consists in identifying particular subsets of the graph called k-cores. Consider a graph G = (V, E) with |V| = n nodes and |E| = m edges. A k-core is defined as follows. Definition: a subgraph H = (C, E|C) induced by the subset C ⊆ V is a k-core, or a core of order k, if and only if every node of C has a degree of at least k in H, and H is a maximum subgraph with this property. The related subset is characterized by a "coreness". It forms a cluster (a community) in the sense of Alvarez-Hamelin et al. (2005). The k-core is obtained by recursively pruning the nodes whose degree is smaller than k. The remaining graph contains only nodes of degree ≥ k. K-core decomposition yields a hierarchical partitioning of the nodes in which the set of coreness 1 is at the top of the hierarchy and the set of maximum coreness is at the bottom. This partition depends on the degree of each node and on the degrees in its neighbourhood. The time complexity of the k-core decomposition algorithm of Alvarez-Hamelin et al. (2005) is O(n + m).


The principal ideas of Card, Mackinlay and Shneiderman (1999) addressed by VisuGraph include:
• reduction of cognitive resources through increased interaction with the user,
• representation of large volumes of data in graphic form,
• improved detection of structures by means of precise and interactive statistical methods,
• means of handling the data representation through control of the semiology.
(Shahar and Cheng 1999) (Balmisse 2005). Many techniques of data visualization have been proposed to date in various applications, such as the study of clinical data (Shahar and Cheng 1999), geographical data (MacEachren et al. 1998), hydrometric data (Kramer and Jozsa 1998) and personal data, such as those contained in a medical file (Liking et al. 1996), and for various purposes, such as the search for significant trends (Harbour et al. 2000), the exploration of program traces (Renieris and Reiss 1999), the analysis of log data (Hochheiser and Schneiderman 2002), temporal abstractions representation