正文部分
182 lines
\begin{abstract}
\begin{abstract}
Graph data analysis, particularly local triangle counting, plays a pivotal role in deciphering complex relationships within graph data. This method is invaluable across diverse fields such as social networks, transportation, and cybersecurity. However, this process often involves handling sensitive information, necessitating that the relationship between any two nodes should be considered private.
Graph data analysis, particularly local triangle counting, plays a pivotal role in deciphering complex relationships within graph data. This method is invaluable across diverse fields such as social networks, transportation, and cybersecurity. However, this process often involves handling sensitive information, necessitating that the relationship between any two nodes be considered private.
Differential privacy (DP) is a formal model to address the privacy concern and can be categorized into two kinds: the central DP (CDP) model, which achieves better result accuracy, and the local DP (LDP) model, which does not assume a trusted server.
Differential privacy (DP) is a formal model to address this privacy concern and can be categorized into two types: the central DP (CDP) model, which achieves better result accuracy, and the local DP (LDP) model, which does not assume a trusted server.
To bridge the gap between the two models, we propose {\textsf{Sectric}}, a \underline{se}rver-aided \underline{c}rypto-assisted \chadded{local} \underline{tri}angle \underline{c}ounting protocol, in this paper.
To bridge the gap between the two models, we propose {\textsf{Sectric}}, a \underline{se}rver-aided \underline{c}rypto-assisted \chadded{local} \underline{tri}angle \underline{c}ounting protocol, in this paper.
It can achieve the same result accuracy with the same privacy budget as the CDP model without assuming a trusted server.
It can achieve the same result accuracy with the same privacy budget as the CDP model without assuming a trusted server.
\textsf{Sectric} also explores a new approach in crypto-assisted graph data analysis algorithms that represents a node's neighbors using a set instead of \chreplaced{an}{a} adjacency vector, and successfully \chreplaced{achieves}{achieve} higher efficiency compared to other crypto-assisted solutions.
\textsf{Sectric} also explores a new approach in crypto-assisted graph data analysis algorithms that represents a node's neighbors using a set instead of \chreplaced{an}{a} adjacency vector, and successfully \chreplaced{achieves}{achieve} higher efficiency compared to other crypto-assisted solutions.
We also conduct theoretical and empirical evaluations to demonstrate that \textsf{Sectric} achieves the design principles.
We also conduct theoretical and empirical evaluations to demonstrate that \textsf{Sectric} achieves the design principles.
\end{abstract}
\end{abstract}
\section{Introduction}
\section{Introduction}
Graph data analysis is pivotal in unraveling complex relationships and patterns in graph data, making it a useful tool in various fields such as social networks~\cite{Backstrom2007WhereforeAT, Charbey2019StarsHO, Isaak2018UserDP, newman2009random}, transportation and logistics~\cite{Leskovec2007CosteffectiveOD, Pourhabibi2020FraudDA, Tan2019AGM, Weber2018ScalableGL}, and cybersecurity~\cite{Jernigan2009GaydarFF}.
Graph data analysis is pivotal in unraveling complex relationships and patterns in graph data, making it a useful tool in various fields such as social networks~\cite{Backstrom2007WhereforeAT, Charbey2019StarsHO, Isaak2018UserDP, newman2009random}, transportation and logistics~\cite{Leskovec2007CosteffectiveOD, Pourhabibi2020FraudDA, Tan2019AGM, Weber2018ScalableGL}, and cybersecurity~\cite{Jernigan2009GaydarFF}.
In many applications, graph data \chreplaced{is}{are} stored in a decentralized manner~\cite{becchetti2008efficient}, where each node knows its neighbors, while no central node has the full graph topology. In this setting, \emph{local triangle counting}, which calculates the triangle counts containing a given node, is a fundamental graph analysis task. It is crucial for many downstream applications. For example, in decentralized social networks, the nodes will first perform a local triangle count to calculate the local clustering coefficient~\cite{newman2009random,becchetti2010efficient,green2014fast,li2017clustering} or local transitivity ratio~\cite{schank2005approximating,al2018triangle}, which reflects its importance in the network.
In many applications, graph data \chreplaced{is}{are} stored in a decentralized manner~\cite{becchetti2008efficient}, where each node knows its neighbors, while no central node has the full graph topology. In this setting, \emph{local triangle counting}, which calculates the triangle counts containing a given node, is a fundamental graph analysis task. It is crucial for many downstream applications. For example, in decentralized social networks, the nodes will first perform a local triangle count to calculate the local clustering coefficient~\cite{newman2009random,becchetti2010efficient,green2014fast,li2017clustering} or local transitivity ratio~\cite{schank2005approximating,al2018triangle}, which reflects its importance in the network.
Privacy is another concern in the decentralized setting. Privacy-sensitive users \chreplaced{would not want}{will not hope} to reveal the identities of their neighbors to third parties. Differential privacy (DP) is a formal model addressing this privacy concern.
Privacy is another concern in the decentralized setting. Privacy-sensitive users \chreplaced{would not want}{will not hope} to reveal the identities of their neighbors to third parties. Differential privacy (DP) is a formal model addressing this privacy concern.
Prior differentially private solutions can be mainly categorized into two kinds: adapting the central DP (CDP) model or the local DP model (LDP).
Prior differentially private solutions can be mainly categorized into two types: adapting the central DP (CDP) model or the local DP model (LDP).
The CDP model assumes a trusted server to calculate the triangle counts and adds a small noise to the final result\footnote{In the CDP model, the server knows the complete and accurate topology of the graph, including all nodes and the connection edges to their neighbors.}. The LDP model eliminates the trust assumption on the server, but has to add more noise in the calculating process with the same privacy budget. Thus, the CDP model has better result accuracy and the LDP model aligns better with the decentralized setting.
The CDP model assumes a trusted server to calculate the triangle counts and adds a small noise to the final result\footnote{In the CDP model, the server knows the complete and accurate topology of the graph, including all nodes and the connection edges to their neighbors.}. The LDP model eliminates the trust assumption on the server, but has to add more noise in the calculating process with the same privacy budget. Thus, the CDP model has better result accuracy, and the LDP model aligns better with the decentralized setting.
To bridge the gap between the two models, we notice the emerging crypto-assisted approach in other graph analysis tasks and explore its application in privacy-preserving local triangle counting. A mainstream framework for adapting this approach is to make graph nodes encrypt their \chreplaced{adjacency}{adjacent} relationships and \chreplaced{send only}{only send} the ciphertexts to the assisting servers. Then, assisting servers interact with graph nodes using cryptographic primitives to finally obtain the analysis result. Thus, this approach does not require trusted assisting servers and provides better privacy protection than the LDP model\footnote{In the LDP model, graph nodes send their adjacent vector with other nodes with bounded noise to the server in the calculating process.}. The \chreplaced{servers}{server} can also have an accurate analysis result due to the use of cryptographic tools, which guarantees utility.
To bridge the gap between the two models, we notice the emerging crypto-assisted approach in other graph analysis tasks and explore its application in privacy-preserving local triangle counting. A mainstream framework for adapting this approach is to make graph nodes encrypt their \chreplaced{adjacency}{adjacent} relationships and \chreplaced{send only}{only send} the ciphertexts to the assisting servers. Then, assisting servers interact with graph nodes using cryptographic primitives to finally obtain the analysis result. Thus, this approach does not require trusted assisting servers and provides better privacy protection than the LDP model\footnote{In the LDP model, graph nodes send their adjacent vector with other nodes with bounded noise to the server in the calculating process.}. The \chreplaced{servers}{server} can also have an accurate analysis result due to the use of cryptographic tools, which guarantees utility.
Following this framework, we design \textsf{Sectric}, a \underline{se}rver-aided \underline{c}rypto-assisted \underline{tri}angle \underline{c}ounting protocol. The first challenge we meet is the high overheads incurred by directly applying \chreplaced{existing}{existed} techniques. For example, if we directly \chreplaced{adapt}{adapting} techniques from CARGO~\cite{liu2024cargo} or MAGO~\cite{wang2023mago} to privacy-preserving local triangle counting, it will bring $O(|\vdv|^2)$ overheads, which are \chreplaced{impractical for}{unsatisfactory on} large graphs. We observe that the root of these adapted solutions' overheads lies in operating on the graph's adjacency matrix.
Following this framework, we design \textsf{Sectric}, a \underline{se}rver-aided \underline{c}rypto-assisted \underline{tri}angle \underline{c}ounting protocol. The first challenge we meet is the high overheads incurred by directly applying \chreplaced{existing}{existed} techniques. For example, if we directly \chreplaced{adapt}{adapting} techniques from CARGO~\cite{liu2024cargo} or MAGO~\cite{wang2023mago} to privacy-preserving local triangle counting, it will bring $O(|\vdv|^2)$ overheads, which are \chreplaced{impractical for}{unsatisfactory on} large graphs. We observe that the root of these adapted solutions' overheads lies in operating on the graph's adjacency matrix.
To reduce the overheads, we explore \chreplaced{the usage of}{to use} adjacency sets, instead of adjacency vectors, to represent \chreplaced{node adjacencies}{node's neighbors}. To implement this idea, we design a novel cryptographic tool named \emph{Three-Party Private Set Membership Test (3PPSMT)}, rather than using secret sharing as CARGO and MAGO. This 3PPSMT primitive reduces the overheads of counting two graph nodes' common neighbors from $O(|\vdv|)$ to $O(d_{max})$ \chreplaced{compared}{comparing} to secret sharing. With this primitive, \textsf{Sectric} \chreplaced{introduces only}{only brings} $O(d_{max}|\vdv|)$ overheads in privacy-preserving local triangle counting. It makes \textsf{Sectric} more suitable for analyzing sparse graphs (i.e., $d_{max} = o(|\vdv|)$).
To reduce the overheads, we explore \chreplaced{the usage of}{to use} adjacency sets, instead of adjacency vectors, to represent \chreplaced{node adjacencies}{node's neighbors}. To implement this idea, we design a novel cryptographic tool named \emph{Three-Party Private Set Membership Test (3PPSMT)}, rather than using secret sharing as CARGO and MAGO. This 3PPSMT primitive reduces the overheads of counting two graph nodes' common neighbors from $O(|\vdv|)$ to $O(d_{max})$ \chreplaced{compared}{comparing} to secret sharing. With this primitive, \textsf{Sectric} \chreplaced{introduces only}{only brings} $O(d_{max}|\vdv|)$ overheads in privacy-preserving local triangle counting. It makes \textsf{Sectric} more suitable for analyzing sparse graphs (i.e., $d_{max} = o(|\vdv|)$).
Meanwhile, \textsf{Sectric} enables the querier to calculate the accurate counting result. It guarantees utility but may also reveal graph nodes' \chreplaced{adjacency}{adjacent} relationship. To prevent this privacy leakage in the \chreplaced{computation}{calculating} result, we also demonstrate that \textsf{Sectric} is also compatible with DP mechanism. \textsf{Sectric} allows to add noise subject to a given distribution, ensuring that the querier receives only the noisy result, while the server gains no information. The noise intensity is \chreplaced{the same as}{same to} that in the CDP model given the same privacy budget. In a nutshell, \textsf{Sectric} has the same result utility when providing the same privacy guarantee as the CDP model, and meanwhile requires no trusted server as the LDP model.
Meanwhile, \textsf{Sectric} enables the querier to calculate the accurate counting result. It guarantees utility but may also reveal graph nodes' \chreplaced{adjacency}{adjacent} relationship. To prevent this privacy leakage in the \chreplaced{computation}{calculating} result, we also demonstrate that \textsf{Sectric} is compatible with the DP mechanism. \textsf{Sectric} allows adding noise subject to a given distribution, ensuring that the querier receives only the noisy result, while the server gains no information. The noise intensity is \chreplaced{the same as}{same to} that in the CDP model given the same privacy budget. In a nutshell, \textsf{Sectric} has the same result utility when providing the same privacy guarantee as the CDP model, and meanwhile requires no trusted server as the LDP model.
Furthermore, we also fully utilize the local view of graph nodes to reduce the number of required servers. Prior crypto-assisted graph data analysis solutions require two or more non-collusive servers to assist in, while \textsf{Sectric} only requires one assisting server. This requirement is easier to implement in practical applications.
Furthermore, we also fully utilize the local view of graph nodes to reduce the number of required servers. Prior crypto-assisted graph data analysis solutions require two or more non-collusive servers to assist, while \textsf{Sectric} only requires one assisting server. This requirement is easier to implement in practical applications.
The main contributions of this work can be summarized as:
The main contributions of this work can be summarized as:
\begin{itemize}
\begin{itemize}
\item We design a novel local triangle counting protocol \textsf{Sectric} bridging the trust assumption and utility gap between the CDP and LDP models in the problem. Our solution shows that the same privacy and utility guarantee as the CDP can be achieved without requiring a trusted server.
\item We design a novel local triangle counting protocol \textsf{Sectric} bridging the trust assumption and utility gap between the CDP and LDP models in the problem. Our solution shows that the same privacy and utility guarantee as the CDP can be achieved without requiring a trusted server.
\item We explore a new approach in crypto-assisted graph data analysis algorithms that represents a node's neighbors using a set instead of an adjacency vector and reduces the overheads from $O(|\vdv|^2)$ to $O(d_{max}|\vdv|)$. We believe that this approach can also be adopted in other tasks to reduce the overheads and will further explore it in the future.
\item We explore a new approach in crypto-assisted graph data analysis algorithms that represents a node's neighbors using a set instead of an adjacency vector and reduces the overheads from $O(|\vdv|^2)$ to $O(d_{max}|\vdv|)$. We believe that this approach can also be adopted in other tasks to reduce the overheads and will further explore it in the future.
%\item We define a novel cryptographic primitive $\fdv_\textsf{3PPSMT}$ and propose the $\Pi_\textsf{3PPSMT}$ protocol, an efficient implementation of this primitive. This primitive can also be applied to other privacy-aware graph data analysis tasks to avoid intermediate privacy costs.
%\item We define a novel cryptographic primitive $\fdv_\textsf{3PPSMT}$ and propose the $\Pi_\textsf{3PPSMT}$ protocol, an efficient implementation of this primitive. This primitive can also be applied to other privacy-aware graph data analysis tasks to avoid intermediate privacy costs.
\item We perform a comprehensive theoretical and empirical analysis of \textsf{Sectric} to demonstrate its privacy guarantees and performance. We also adapt the open-source implementation of a state-of-the-art work~\cite{liu2024cargo,CARGO} to local triangle counting as the baseline.
\item We perform a comprehensive theoretical and empirical analysis of \textsf{Sectric} to demonstrate its privacy guarantees and performance. We also adapt the open-source implementation of a state-of-the-art work~\cite{liu2024cargo,CARGO} to local triangle counting as the baseline.
\end{itemize}
\end{itemize}
\stitle{Paper Organization.}
\stitle{Paper Organization.}
The rest of this paper is organized as follows: In Section \ref{section: related work}, we discuss the related works. Then, we introduce the necessary preliminaries of this work in Section \ref{section: preliminaries}. In Section \ref{section: 3PPSMT}, we define the primitive $\fdv_\textsf{3PPSMT}$ and propose the $\Pi_\textsf{3PPSMT}$ protocol implementing this primitive. Based on this primitive, the construction of the \textsf{Sectric} protocol is presented in Section \ref{section: sectric construction}. Section \ref{section: implementation and evaluation} presents the experimental results and Section \ref{section: conclusion} concludes this paper.
The rest of this paper is organized as follows: In Section \ref{section: related work}, we discuss the related works. Then, we introduce the necessary preliminaries of this work in Section \ref{section: preliminaries}. In Section \ref{section: 3PPSMT}, we define the primitive $\fdv_\textsf{3PPSMT}$ and propose the $\Pi_\textsf{3PPSMT}$ protocol implementing this primitive. Based on this primitive, the construction of the \textsf{Sectric} protocol is presented in Section \ref{section: sectric construction}. Section \ref{section: implementation and evaluation} presents the experimental results, and Section \ref{section: conclusion} concludes this paper.
\section{Related Work}
\section{Related Work}
\label{section: related work}
\label{section: related work}
\subsection{Privacy-Preserving Triangle Counting}
\subsection{Privacy-Preserving Triangle Counting}
The problem of privacy-preserving triangle counting has been an active area of research. Existing works can be broadly categorized into two groups based on whether they assume the existence of a trusted server or not: centralized model-based approaches and decentralized model-based approaches.
The problem of privacy-preserving triangle counting has been an active area of research. Existing works can be broadly categorized into two groups based on whether they assume the existence of a trusted server or not: centralized model-based approaches and decentralized model-based approaches.
The centralized model assumes the existence of a trusted server which holds the entire graph. Ding et al.~\cite{Ding2021DifferentiallyPT} achieve a balance between the accuracy of triangle counting and data privacy by selecting appropriate edge deletion strategies. Raskhodnikova et al.~\cite{Karwa2011PrivateAO, Nissim2007SmoothSA} use randomized strategies to ensure that the published triangle counts do not accurately allow inferring the existence of any particular edge, while Kasiviswanathan et al.~\cite{Kasiviswanathan2013AnalyzingGW} have achieved this by projecting the graph with a limited degree threshold. However, these approaches require a fully trusted and central server, which can introduce privacy issues in many applications.
The centralized model assumes the existence of a trusted server that holds the entire graph. Ding et al.~\cite{Ding2021DifferentiallyPT} achieve a balance between the accuracy of triangle counting and data privacy by selecting appropriate edge deletion strategies. Raskhodnikova et al.~\cite{Karwa2011PrivateAO, Nissim2007SmoothSA} use randomized strategies to ensure that the published triangle counts do not accurately allow inferring the existence of any particular edge, while Kasiviswanathan et al.~\cite{Kasiviswanathan2013AnalyzingGW} have achieved this by projecting the graph with a limited degree threshold. However, these approaches require a fully trusted and central server, which can introduce privacy issues in many applications.
The solutions proposed by these works can provide privacy in the existence of a fully trusted central server. However, in the scenarios where a trusted server is intractable to be implemented, their solutions cannot be directly applied.
The solutions proposed by these works can provide privacy in the presence of a fully trusted central server. However, in scenarios where a trusted server is intractable to implement, their solutions cannot be directly applied.
Thus, many works have proposed a decentralized model, where the node set is still considered public knowledge, but the relationship between two nodes is only known to them and treated as their privacy. Sun et al.~\cite{Sun2019AnalyzingSS} \chreplaced{propose}{proposed} a local differential privacy approach, where users locally perturb their adjacency vectors to protect the privacy of edges. But their assumption that users have an extended local view, allowing them to see their 2-hop neighbors, introduces the data correlation problem~\cite{Liu2022CollectingTC}, and is not applicable in most real-world cases. In the more realistic scenario where users can only see their immediate neighbors, Imola et al.~\cite{Imola2021CommunicationEfficientTC, Imola2020LocallyDP} utilize multiple rounds of interactions to upload and download perturbed edge information. While this approach can preserve privacy, it also introduces a non-negligible additive error, as highlighted by~\cite{Eden2023TriangleCW}.
Thus, many works have proposed a decentralized model, where the node set is still considered public knowledge, but the relationship between two nodes is only known to them and treated as their privacy. Sun et al.~\cite{Sun2019AnalyzingSS} \chreplaced{propose}{proposed} a local differential privacy approach, where users locally perturb their adjacency vectors to protect the privacy of edges. However, their assumption that users have an extended local view, allowing them to see their 2-hop neighbors, introduces the data correlation problem~\cite{Liu2022CollectingTC}, and is not applicable in most real-world cases. In the more realistic scenario where users can only see their immediate neighbors, Imola et al.~\cite{Imola2021CommunicationEfficientTC, Imola2020LocallyDP} utilize multiple rounds of interactions to upload and download perturbed edge information. While this approach can preserve privacy, it also introduces a non-negligible additive error, as highlighted by~\cite{Eden2023TriangleCW}.
In the decentralized model, crypto-assisted solutions to triangle counting are also emerging in recent years. CARGO~\cite{liu2024cargo} utilizes a hybrid approach that combines additive secret sharing and differential privacy, allowing two untrusted servers to only see encoded values beyond other information. This approach enables users to add smaller amounts of noise when implementing differential privacy, thereby achieving better utility compared to ~\cite{Imola2021CommunicationEfficientTC}. Building on a similar approach, Imola et al.~\cite{Imola2022DifferentiallyPT} introduce a trusted intermediate server with shuffling functionality. Another work, MAGO~\cite{wang2023mago}, which is based on lightweight secret sharing techniques, utilizes three servers from different trust domains working in coordination to improve the accuracy of triangle counting, and also detect whether malicious adversaries attempt to tamper with the statistical result.
In the decentralized model, crypto-assisted solutions to triangle counting are also emerging in recent years. CARGO~\cite{liu2024cargo} utilizes a hybrid approach that combines additive secret sharing and differential privacy, allowing two untrusted servers to only see encoded values beyond other information. This approach enables users to add smaller amounts of noise when implementing differential privacy, thereby achieving better utility compared to ~\cite{Imola2021CommunicationEfficientTC}. Building on a similar approach, Imola et al.~\cite{Imola2022DifferentiallyPT} introduce a trusted intermediate server with shuffling functionality. Another work, MAGO~\cite{wang2023mago}, which is based on lightweight secret sharing techniques, utilizes three servers from different trust domains working in coordination to improve the accuracy of triangle counting and also detect whether malicious adversaries attempt to tamper with the statistical result.
A summary of the comparison between \textsf{Sectric} and other related works on triangle counting are presented in Table \ref{tab: related works}.
A summary of the comparison between \textsf{Sectric} and other related works on triangle counting is presented in Table \ref{tab: related works}.
\subsection{Crypto-Assisted Graph Analytics}
\subsection{Crypto-Assisted Graph Analytics}
Crypto-assisted solutions are also emerging in other graph analytics tasks.
Crypto-assisted solutions are also emerging in other graph analytics tasks.
Some works enable users to securely contribute their local views on a decentralized social graph for spectral analytics. Sharma et al.~\cite{Sharma2019PrivateGraphPS} utilize homomorphic encryption to protect the privacy of graph edges, allowing distributed data owners to interact with cloud-based programs while keeping their data private from the cloud service provider. PrivGED~\cite{Wang2022PrivacyPreservingAO} employs secret sharing to encrypt the elements in local view vectors, enabling privacy-preserving eigen-decomposition analytics over decentralized social graphs while safeguarding users' social relationships. However, these studies focus on different analytical tasks than our work on \chadded{local} triangle counting.
Some works enable users to securely contribute their local views on a decentralized social graph for spectral analytics. Sharma et al.~\cite{Sharma2019PrivateGraphPS} utilize homomorphic encryption to protect the privacy of graph edges, allowing distributed data owners to interact with cloud-based programs while keeping their data private from the cloud service provider. PrivGED~\cite{Wang2022PrivacyPreservingAO} employs secret sharing to encrypt the elements in local view vectors, enabling privacy-preserving eigen-decomposition analytics over decentralized social graphs while safeguarding users' social relationships. However, these studies focus on different analytical tasks than our work on \chadded{local} triangle counting.
Another line of research leverages cryptographic techniques for privacy-preserving epidemiological analysis, such as analyzing transmission chains or clusters to predict infection rates using contact data stored on mobile devices. RIPPLE~\cite{Gunther2022PosterPE, Holz2020PEMPE} enables realistic simulations on the actual person-to-person social contact graph, utilizing a set of semi-honest non-colluding MPC servers to facilitate communication among participants. Colo~\cite{Liu2024MakingPF} introduces a protocol that guards against malicious device behavior using random masks, efficient commitments, and range proofs, ensuring that devices only learn their own node, edge, and topology data, while the analyst only learns the query result. However, these methods are not directly applicable to our \chadded{local} triangle counting problem.
Another line of research leverages cryptographic techniques for privacy-preserving epidemiological analysis, such as analyzing transmission chains or clusters to predict infection rates using contact data stored on mobile devices. RIPPLE~\cite{Gunther2022PosterPE, Holz2020PEMPE} enables realistic simulations on the actual person-to-person social contact graph, utilizing a set of semi-honest non-colluding MPC servers to facilitate communication among participants. Colo~\cite{Liu2024MakingPF} introduces a protocol that guards against malicious device behavior using random masks, efficient commitments, and range proofs, ensuring that devices only learn their own node, edge, and topology data, while the analyst only learns the query result. However, these methods are not directly applicable to our \chadded{local} triangle counting problem.
There is also a research direction focusing on collaborative graph analytics, where each client possesses a local subgraph with multiple nodes and edges. Araki et al.~\cite{Araki2021SecureGA} propose a secure shuffling method for a 3-server setting with an honest majority, implementing algorithms like breadth-first search and maximal independent set. Guan et al.~\cite{Guan2023EfficientAP} design a scheme for two data owners to jointly respond to a subgraph matching query without disclosing their graph datasets to each other. FEAT~\cite{liu2024federated} has a central server that collects subgraph data from clients using private set union, aggregates them into a noisy global graph, and performs triangle counting. Oryx~\cite{ZhongOryxP} can detect cycles of various lengths on a multi-party federated graph while preserving topological secrecy. Pang et al.~\cite{PANG2025103952} design a scheme based on structured encryption and private set intersection cardinality techniques. They provide server tokens to queriers to query the butterfly counts of specific nodes or edges. However, the assumptions in these studies differ from our scenario, where users only have a local perspective.
There is also a research direction focusing on collaborative graph analytics, where each client possesses a local subgraph with multiple nodes and edges. Araki et al.~\cite{Araki2021SecureGA} propose a secure shuffling method for a 3-server setting with an honest majority, implementing algorithms like breadth-first search and maximal independent set. Guan et al.~\cite{Guan2023EfficientAP} design a scheme for two data owners to jointly respond to a subgraph matching query without disclosing their graph datasets to each other. FEAT~\cite{liu2024federated} has a central server that collects subgraph data from clients using private set union, aggregates them into a noisy global graph, and performs triangle counting. Oryx~\cite{ZhongOryxP} can detect cycles of various lengths on a multi-party federated graph while preserving topological secrecy. Pang et al.~\cite{PANG2025103952} design a scheme based on structured encryption and private set intersection cardinality techniques. They provide server tokens to queriers to query the butterfly counts of specific nodes or edges. However, the assumptions in these studies differ from our scenario, where users only have a local perspective.
\section{Preliminaries}
\section{Preliminaries}
\label{section: preliminaries}
\label{section: preliminaries}
We first introduce some notations used in this paper. For a positive integer $N$, $[N]$ denotes the set $\setpresentation{1, 2, \dots, N}$\chdeleted{ for a given positive integer N}, and $\mathbb{Z}_N$ represents the group modulo $N$. Given a set $\xdv$, $x \xleftarrow{\$} \xdv$ indicates that $x$ is uniformly selected from $\xdv$. A real number ensemble $\setpresentation{p_\lambda}_{\lambda \in \mathbb{N}}$ is negligible if for any polynomial $p$, $p_\lambda \leq \frac{1}{p(\lambda)}$ for all sufficiently large $\lambda$.
We first introduce some notations used in this paper. For a positive integer $N$, $[N]$ denotes the set $\setpresentation{1, 2, \dots, N}$\chdeleted{ for a given positive integer N}, and $\mathbb{Z}_N$ represents the group modulo $N$. Given a set $\xdv$, $x \xleftarrow{\$} \xdv$ indicates that $x$ is uniformly selected from $\xdv$. A real number ensemble $\setpresentation{p_\lambda}_{\lambda \in \mathbb{N}}$ is negligible if, for any polynomial $p$, $p_\lambda \leq \frac{1}{p(\lambda)}$ for all sufficiently large $\lambda$.
\subsection{Problem Definition}
\subsection{Problem Definition}
\label{subsection: problem definition}
\label{subsection: problem definition}
\stitle{Local Triangle Counting.}
\stitle{Local Triangle Counting.}
Let $\gdv = (\vdv, \edv)$ be a graph, where $\vdv$ and $\edv$ represent the set of nodes and edges, respectively. Two nodes $u, v \in \vdv$ are adjacent if $(u, v) \in \edv$. We consider undirected graphs, so for any two nodes $u, v \in \vdv$, $(u, v) \in \edv$ if and only if $(v, u) \in \edv$. Without loss of generality, we assume the nodes in $\vdv$ are indexed from $1$ to $|\vdv|$.
Let $\gdv = (\vdv, \edv)$ be a graph, where $\vdv$ and $\edv$ represent the set of nodes and edges, respectively. Two nodes $u, v \in \vdv$ are adjacent if $(u, v) \in \edv$. We consider undirected graphs, so for any two nodes $u, v \in \vdv$, $(u, v) \in \edv$ if and only if $(v, u) \in \edv$. Without loss of generality, we assume the nodes in $\vdv$ are indexed from $1$ to $|\vdv|$.
We define the notion of local triangle sets in Definition \ref{definition: local triangle set}. Intuitively speaking, the local triangle set $\Delta_u$ of a node $u$ is the set of all triangles containing $u$. With the notion of local triangle set, the {local triangle counting} problem can be stated as calculating $|\Delta_u|$ given a graph $\gdv = (\vdv, \edv)$ and a node $u \in \vdv$.
We define the notion of local triangle sets in Definition \ref{definition: local triangle set}. Intuitively speaking, the local triangle set $\Delta_u$ of a node $u$ is the set of all triangles containing $u$. With the notion of local triangle set, the {local triangle counting} problem can be stated as calculating $|\Delta_u|$ given a graph $\gdv = (\vdv, \edv)$ and a node $u \in \vdv$.
\begin{definition}[Local triangle set]
\begin{definition}[Local triangle set]
\label{definition: local triangle set}
\label{definition: local triangle set}
Given a graph $\gdv = (\vdv, \edv)$, the local triangle set of a node $u \in \vdv$ is defined as:
Given a graph $\gdv = (\vdv, \edv)$, the local triangle set of a node $u \in \vdv$ is defined as:
$$
$$
\Delta_u = \setpresentation{\setpresentation{u, v, w} \subset \vdv: (u, v) , (u, w) , (v, w) \in \edv}.
\Delta_u = \setpresentation{\setpresentation{u, v, w} \subset \vdv: (u, v) , (u, w) , (v, w) \in \edv}.
$$
$$
\end{definition}
\end{definition}
We also define the notion of a graph node's neighbor set in Definition \ref{definition: neighbor set}. The neighbor set $N_u$ of a node $u$ denotes the set of all nodes adjacent to $u$.
We also define the notion of a graph node's neighbor set in Definition \ref{definition: neighbor set}. The neighbor set $N_u$ of a node $u$ denotes the set of all nodes adjacent to $u$.
\begin{definition}[Neighbor set]
\begin{definition}[Neighbor set]
\label{definition: neighbor set}
\label{definition: neighbor set}
Given a graph $\gdv = (\vdv, \edv)$, we define the neighbor set of a node $u \in \vdv$ as
Given a graph $\gdv = (\vdv, \edv)$, we define the neighbor set of a node $u \in \vdv$ as
$$
$$
N_u = \setpresentation{v \in \vdv: (u, v) \in \edv}.
N_u = \setpresentation{v \in \vdv: (u, v) \in \edv}.
$$
$$
\end{definition}
\end{definition}
The following theorem establishes the relationship between the local triangle sets and the neighbor sets, which serves as the foundation for our \textsf{Sectric} protocol.
The following theorem establishes the relationship between the local triangle sets and the neighbor sets, which serves as the foundation for our \textsf{Sectric} protocol.
\begin{theorem}
\begin{theorem}
\label{theorem: relation between neighbor and local triangle}
\label{theorem: relation between neighbor and local triangle}
Given a graph $\gdv = (\vdv, \edv)$, for any node $u \in \vdv$, we have
Given a graph $\gdv = (\vdv, \edv)$, for any node $u \in \vdv$, we have
$$
$$
|\Delta_u| = \frac{1}{2} \sum_{v \in N_u} |N_u \cap N_v|.
|\Delta_u| = \frac{1}{2} \sum_{v \in N_u} |N_u \cap N_v|.
$$
$$
\end{theorem}
\end{theorem}
\begin{proof}\color{blue}
\begin{proof}\color{blue}
We know that the number of triangles containing node $u$ is given by:
We know that the number of triangles containing node $u$ is given by:
$$
$$
|\Delta_u| = \sum_{v \in N_u} \sum_{w \in N_u, w > v} I(v, w),
|\Delta_u| = \sum_{v \in N_u} \sum_{w \in N_u, w > v} I(v, w),
$$
$$
where $I(v, w)$ is an indicator function that equals 1 if there is an edge between $v$ and $w$, and 0 otherwise.
where $I(v, w)$ is an indicator function that equals 1 if there is an edge between $v$ and $w$, and 0 otherwise.
Additionally, we have:
Additionally, we have:
$$
$$
|N_u \cap N_v| = \sum_{w \in N_u} I(v, w).
|N_u \cap N_v| = \sum_{w \in N_u} I(v, w).
$$
$$
Thus, we can express the sum as:
Thus, we can express the sum as:
$$
$$
\sum_{v \in N_u} |N_u \cap N_v| = \sum_{v \in N_u} \sum_{w \in N_u} I(v, w).
\sum_{v \in N_u} |N_u \cap N_v| = \sum_{v \in N_u} \sum_{w \in N_u} I(v, w).
$$
$$
In the calculation process, the positions of $v$ and $w$ are equivalent and interchangeable. Therefore, we can conclude that:
In the calculation process, the positions of $v$ and $w$ are equivalent and interchangeable. Therefore, we can conclude that:
$$
$$
|\Delta_u| = \sum_{v \in N_u} \sum_{w \in N_u, w > v} I(v, w) = \frac{1}{2} \sum_{v \in N_u} |N_u \cap N_v|.
|\Delta_u| = \sum_{v \in N_u} \sum_{w \in N_u, w > v} I(v, w) = \frac{1}{2} \sum_{v \in N_u} |N_u \cap N_v|.
$$
$$
\end{proof}
\end{proof}
\stitle{System Model.}
\stitle{System Model.}
We design \textsf{Sectric} in a server-aided paradigm, where a server $\sdv$ is introduced. The \textsf{Sectric} system model is illustrated in Fig. \ref{figure: Sectric system example}. The system participants include the graph nodes and the server $\sdv$. The left side of the figure depicts the graph, where the solid lines represent the adjacency relations between nodes (i.e., the edges). The right side shows the server $\sdv$. The dashed lines between the graph nodes and $\sdv$ indicate the communication model, where every node has a communication channel with the server $\sdv$.
We design \textsf{Sectric} in a server-aided paradigm, where a server $\sdv$ is introduced. The \textsf{Sectric} system model is illustrated in Fig. \ref{figure: Sectric system example}. The system participants include the graph nodes and the server $\sdv$. The left side of the figure depicts the graph, where the solid lines represent the adjacency relations between nodes (i.e., the edges). The right side shows the server $\sdv$. The dashed lines between the graph nodes and $\sdv$ indicate the communication model, where every node has a communication channel with the server $\sdv$.
The \textsf{Sectric} protocol is designed in the context of decentralized graph data. Supposing the graph is represented as $\gdv = (\vdv, \edv)$, we model this decentralized setting such that the number of graph nodes $|\vdv|$ is public knowledge, and each graph node $u \in \vdv$ is only aware of its neighbor set $N_u$ (ref. Definition \ref{definition: neighbor set}). The server $\sdv$ has no knowledge about the graph's topology (i.e., the edges).
The \textsf{Sectric} protocol is designed in the context of decentralized graph data. Supposing the graph is represented as $\gdv = (\vdv, \edv)$, we model this decentralized setting such that the number of graph nodes $|\vdv|$ is public knowledge, and each graph node $u \in \vdv$ is only aware of its neighbor set $N_u$ (ref. Definition \ref{definition: neighbor set}). The server $\sdv$ has no knowledge about the graph's topology (i.e., the edges).
In \textsf{Sectric}, a graph node $Q$ acts as the querier and requests the number of its local triangles. During the protocol execution, the querier $Q$ interacts with the server $\sdv$ and its neighbors. Finally, the protocol outputs the number of $Q$'s local triangles to $Q$.
In \textsf{Sectric}, a graph node $Q$ acts as the querier and requests the number of its local triangles. During the protocol execution, the querier $Q$ interacts with the server $\sdv$ and its neighbors. Finally, the protocol outputs the number of $Q$'s local triangles to $Q$.
\stitle{Threat Model and Privacy Constraint.}
\stitle{Threat Model and Privacy Constraint.}
In this work, we consider a commonly adopted semi-honest model. This model assumes that the graph nodes and the server $\sdv$ will follow the \textsf{Sectric} protocol, but may attempt to extract additional knowledge about a given node's neighbor set from their view (ref. Definition \ref{definition: protocol participants' view}). In \textsf{Sectric}, we allow graph nodes to collude, but require that the server does not collude with any graph node.
In this work, we consider a commonly adopted semi-honest model. This model assumes that the graph nodes and the server $\sdv$ will follow the \textsf{Sectric} protocol, but may attempt to extract additional knowledge about a given node's neighbor set from their view (ref. Definition \ref{definition: protocol participants' view}). In \textsf{Sectric}, we allow graph nodes to collude, but require that the server does not collude with any graph node.
\begin{definition}[Participants' View]
\begin{definition}[Participants' View]
\label{definition: protocol participants' view}
\label{definition: protocol participants' view}
Suppose a multiparty computation protocol $\Pi$ is executed by a set of participants $P_1$, \dots, $P_n$. The view of $P_i$ in the execution is defined as $$VIEW_i = \setpresentation{r_i, \mdv_i},$$
Suppose a multiparty computation protocol $\Pi$ is executed by a set of participants $P_1$, \dots, $P_n$. The view of $P_i$ in the execution is defined as $$VIEW_i = \setpresentation{r_i, \mdv_i},$$
where $r_i$ is the randomness used by $P_i$ in the execution and $\mdv_i$ is all the messages it receives.
where $r_i$ is the randomness used by $P_i$ in the execution and $\mdv_i$ is all the messages it receives.
\end{definition}
\end{definition}
The privacy constraint considered by \textsf{Sectric} is to protect the privacy of graph nodes in the local triangle counting task, which mainly refers to their adjacency relations in the decentralized graph setting. In other words, a graph node $u$'s private information is its neighbor set $N_u$, which should not be revealed to other graph nodes or the server $\sdv$. We refer to this privacy requirement as ``neighbor privacy''.
The privacy constraint considered by \textsf{Sectric} is to protect the privacy of graph nodes in the local triangle counting task, which mainly refers to their adjacency relations in the decentralized graph setting. In other words, a graph node $u$'s private information is its neighbor set $N_u$, which should not be revealed to other graph nodes or the server $\sdv$. We refer to this privacy requirement as ``neighbor privacy''.
Therefore, the privacy of \textsf{Sectric} guarantees that for any subset of graph nodes $\vdv' \subset \vdv$ and all $u \in \vdv \setminus \vdv'$, the joint view of $\vdv'$ will not reveal any knowledge about $u$'s neighbor set $N_u$. It is also guaranteed that the server $\sdv$ will have no knowledge about any node $u$'s neighbor set $N_u$ during the execution of \textsf{Sectric}.
Therefore, the privacy of \textsf{Sectric} guarantees that for any subset of graph nodes $\vdv' \subset \vdv$ and all $u \in \vdv \setminus \vdv'$, the joint view of $\vdv'$ will not reveal any knowledge about $u$'s neighbor set $N_u$. It is also guaranteed that the server $\sdv$ will have no knowledge about any node $u$'s neighbor set $N_u$ during the execution of \textsf{Sectric}.
\begin{definition}[Privacy Constraint]
\begin{definition}[Privacy Constraint]
\label{definition: sectric privacy}
\label{definition: sectric privacy}
\textsf{Sectric} is a privacy-preserving protocol if there exist polynomial-time algorithms $\adv_{\sdv}$, $\adv_Q$ and $\adv_{\vdv'}$ such that the view of $\sdv$, the view of the querier $Q$, and the joint view of graph nodes $\vdv' \subset \vdv$ in the execution of \textsf{Sectric} can be simulated by $\adv_{\sdv}(1^\lambda, 1^{|\vdv|})$, $\adv_Q(1^\lambda, 1^{|\vdv|})$, and $\adv_{\vdv'}(1^\lambda, 1^{|\vdv|})$, respectively, where $\lambda$ is the security parameter. Here, ``view being simulated by an algorithm'' means that the view and the algorithm's output are computationally indistinguishable.
\textsf{Sectric} is a privacy-preserving protocol if there exist polynomial-time algorithms $\adv_{\sdv}$, $\adv_Q$, and $\adv_{\vdv'}$ such that the view of $\sdv$, the view of the querier $Q$, and the joint view of graph nodes $\vdv' \subset \vdv$ in the execution of \textsf{Sectric} can be simulated by $\adv_{\sdv}(1^\lambda, 1^{|\vdv|})$, $\adv_Q(1^\lambda, 1^{|\vdv|})$, and $\adv_{\vdv'}(1^\lambda, 1^{|\vdv|})$, respectively, where $\lambda$ is the security parameter. Here, ``view being simulated by an algorithm'' means that the view and the algorithm's output are computationally indistinguishable.
\end{definition}
\end{definition}
\subsection{Cryptographic Tools}
\subsection{Cryptographic Tools}
In the following, we introduce the cryptographic tools used in this work.
In the following, we introduce the cryptographic tools used in this work.
\stitle{Random oracles.}
\stitle{Random oracles.}
A random oracle is a theoretical construct used in the context of cryptography and complexity theory. It is defined by specifying query and image domains, and it responds with an element from the image domain for every query in the query domain.
A random oracle is a theoretical construct used in the context of cryptography and complexity theory. It is defined by specifying query and image domains, and it responds with an element from the image domain for every query in the query domain.
A random oracle $\hdv$ responds with a uniformly random value for every newly appeared query, and a fixed value for repeated queries. Random oracles are typically implemented using cryptographic hash functions, such as SHA-2 or SHA-3.
A random oracle $\hdv$ responds with a uniformly random value for every newly appeared query, and a fixed value for repeated queries. Random oracles are typically implemented using cryptographic hash functions, such as SHA-2 or SHA-3.
\stitle{Multi-point OPRF (mpOPRF).}
\stitle{Multi-point OPRF (mpOPRF).}
A pseudorandom function (PRF) family is a cryptographic tool which emulates a random function family. This notion is formalized in Definition \ref{definition: pseudorandom function}. Given a uniform seed $s$, a pseudorandom function is computationally indistinguishable with a truely random function. It is usually implemented with symmetric encryption algorithms, such as AES, in practice.
A pseudorandom function (PRF) family is a cryptographic tool that emulates a random function family. This notion is formalized in Definition \ref{definition: pseudorandom function}. Given a uniform seed $s$, a pseudorandom function is computationally indistinguishable from a truly random function. It is usually implemented with symmetric encryption algorithms, such as AES, in practice.
\begin{definition}[PRF family]
\begin{definition}[PRF family]
A function family $\setpresentation{f_s}_{s \in \bin^*}$, where $f_s$ is a map $\bin^{|s|} \rightarrow \bin^{|s|}$ for all $s \in \bin^*$, is a PRF family if it satisfies the following conditions:
A function family $\setpresentation{f_s}_{s \in \bin^*}$, where $f_s$ is a map $\bin^{|s|} \rightarrow \bin^{|s|}$ for all $s \in \bin^*$, is a PRF family if it satisfies the following conditions:
\begin{enumerate}
\begin{enumerate}
\item There exists a polynomial-time algorithm $F$ such that $F(s, x) = f_s(x)$ for all $\lambda \in \mathbb{N}$ and all $s, x \in \bin^\lambda$.
\item There exists a polynomial-time algorithm $F$ such that $F(s, x) = f_s(x)$ for all $\lambda \in \mathbb{N}$ and all $s, x \in \bin^\lambda$.
\item For any probabilistic polynomial-time algorithm $\adv$, $$|Pr[\adv^{f_s(\cdot)}(1^\lambda) | s \xleftarrow{\$} \bin^{\lambda}] - Pr[\adv^{f(\cdot)}(1^\lambda) | f \xleftarrow{\$} \fdv_{\lambda}]|$$
\item For any probabilistic polynomial-time algorithm $\adv$, $$|Pr[\adv^{f_s(\cdot)}(1^\lambda) | s \xleftarrow{\$} \bin^{\lambda}] - Pr[\adv^{f(\cdot)}(1^\lambda) | f \xleftarrow{\$} \fdv_{\lambda}]|$$
is negligible, where $\adv^{f(\cdot)}$ denotes $\adv$ can access oracle $f$ and $f \xleftarrow{\$} \fdv_{\lambda}$ denotes $f$ is a truly random function mapping $\bin^\lambda \rightarrow \bin^\lambda$.
is negligible, where $\adv^{f(\cdot)}$ denotes that $\adv$ can access oracle $f$ and $f \xleftarrow{\$} \fdv_{\lambda}$ denotes that $f$ is a truly random function mapping $\bin^\lambda \rightarrow \bin^\lambda$.
\end{enumerate}
\end{enumerate}
\label{definition: pseudorandom function}
\label{definition: pseudorandom function}
\end{definition}
\end{definition}
Given a PRF family $\setpresentation{f_s}$, a mpOPRF protocol involves two parties and realizes the functionality $\fdv_\textsf{mpOPRF}$ defined in Algorithm \ref{algorithm: multi-point OPRF}. The sender and receiver specify a key $k$ and inputs $\vectorpresentation{x_i}_{i \in [n]}$, respectively. $\fdv_\textsf{mpOPRF}$ allows the receiver to learn the outcomes $\vectorpresen
Given a PRF family $\setpresentation{f_s}$, an mpOPRF protocol involves two parties and realizes the functionality $\fdv_\textsf{mpOPRF}$ defined in Algorithm \ref{algorithm: multi-point OPRF}. The sender and receiver specify a key $k$ and inputs $\vectorpresentation{x_i}_{i \in [n]}$, respectively. $\fdv_\textsf{mpOPRF}$ allows the receiver to learn the outcomes $\vectorpresentation