Tax Evasion Detection Method Based on Positive and Unlabeled Learning with Network Embedding Features
This research proposes a method for detecting tax evasion that applies positive and unlabeled learning to network embedding features. It combines graph neural networks, positive and unlabeled learning, and network embedding, and reports high accuracy on the detection task.
Key Contributions
1. Novel Framework for Tax Evasion Detection
The framework applies positive and unlabeled learning to network embedding features. In this setting only some taxpayers are labeled as confirmed evaders, and the rest are unlabeled rather than known to be compliant. The framework is designed to capture relationships between taxpayers and to flag those likely to be evading tax.
2. Graph Neural Networks for Node Representation Learning
The method uses graph neural networks (GNNs) to learn node representations, so the model can pick up patterns in how taxpayers relate to one another. This lets it capture subtle differences in behavior that may indicate tax evasion.
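The core idea behind GNN-style node representation learning, mixing each node's features with those of its neighbors, can be sketched as follows. This is a minimal illustration and not the architecture used in the research: it has no learned weights or nonlinearity, which a real GNN layer would add. The function name `mean_aggregate` and the three-taxpayer network are hypothetical.

```python
from typing import Dict, List


def mean_aggregate(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One round of mean aggregation over each node and its neighbors."""
    updated = {}
    for node, own in features.items():
        # Pool the node's own vector with the vectors of its neighbors.
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        # Average the pooled vectors dimension by dimension.
        updated[node] = [
            sum(vec[i] for vec in group) / len(group) for i in range(len(own))
        ]
    return updated


# Hypothetical toy network: three taxpayers linked by transactions.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.0, 0.0]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
embeddings = mean_aggregate(features, neighbors)
```

After one round, taxpayer C, which started with an all-zero vector, carries part of B's signal. Stacking several rounds lets information travel across longer paths in the network.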
3. Extension of Positive and Unlabeled Learning Method
The proposed method extends existing positive and unlabeled learning (PU-Learning) methods by adding network embedding features to the model's input. This lets the model draw on both GNNs and PU-Learning, which the research reports improves accuracy and robustness.
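One way to picture PU-Learning over combined features is sketched below. It uses the label-frequency correction of Elkan and Noto, a standard PU technique chosen here for illustration; the source does not say which PU estimator the research uses. The sketch assumes labeled evaders are a random sample of all evaders, and every name and score in it is hypothetical.

```python
from typing import List


def build_features(basic: List[float], embedding: List[float]) -> List[float]:
    """Concatenate a taxpayer's basic features with its network embedding."""
    return basic + embedding


def estimate_label_frequency(scores_on_labeled_positives: List[float]) -> float:
    """Estimate c = P(labeled | positive) as the mean score on held-out labeled positives."""
    if not scores_on_labeled_positives:
        raise ValueError("need at least one labeled positive")
    return sum(scores_on_labeled_positives) / len(scores_on_labeled_positives)


def correct_scores(scores: List[float], c: float) -> List[float]:
    """Turn P(labeled | x) into an estimate of P(positive | x), clipped at 1."""
    return [min(1.0, s / c) for s in scores]


# Input vector a classifier would see for one taxpayer (made-up values).
x = build_features([0.3, 1.2], [0.5, 0.5])

# Made-up outputs of a classifier trained to separate labeled from unlabeled taxpayers.
held_out_positive_scores = [0.5, 0.25, 0.75]
unlabeled_scores = [0.1, 0.25, 0.6]

c = estimate_label_frequency(held_out_positive_scores)
evasion_probability = correct_scores(unlabeled_scores, c)
```

Dividing by `c` raises the scores of unlabeled taxpayers, reflecting that an unlabeled taxpayer may still be an evader who was never audited.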
Evaluation Results
In the evaluation, the proposed method outperforms existing methods on both accuracy and F1-score, which supports its effectiveness for tax evasion detection.
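For reference, the two metrics can be computed as in the sketch below. The labels are made up for illustration and are not results from the research; 1 marks an evader and 0 a compliant taxpayer.

```python
from typing import List, Tuple


def accuracy_and_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Made-up labels: two evaders among eight taxpayers.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

Here accuracy is 0.75 while F1 is only 0.5. When evaders are a small minority, accuracy is dominated by the compliant majority, which is why F1 is reported alongside it.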
Limitations and Future Work
1. Limited Generalizability
The dataset comes from a single country, so the results may not generalize to other countries. Future work can address this by testing the method on more diverse datasets.
2. Feature Extraction Limitations
The feature extraction process relies on pre-trained word embeddings, which may not capture domain-specific knowledge. Future work can develop feature extraction methods that incorporate it.
Acknowledgments
The authors acknowledge the support of various funding agencies and research initiatives, including the National Key Research and Development Program of China and the MOE Innovation Research Team.